NOVEL NUCLEIC ACIDS AND POLYPEPTIDES



1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods. 

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel 
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules, 
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more 
epitopes present on such polypeptides, as well as hybridomas producing such antibodies. 

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
polypeptide sequences are designated SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
The nucleic acids and polypeptides are provided in the Sequence Listing. In the nucleic acids
provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N
is any of the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 under
stringent hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
sequence information can be a segment of any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety



WO 01/57190 PCT/US01/04098 
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or novel segments or parts of the nucleic acids of the invention are used as
primers in expression assays that are well known in the art. In a particularly preferred embodiment,
the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or novel
segments or parts of the nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59
(1992), as expressed sequence tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954; a polynucleotide comprising any of the full length protein
coding sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; and a polynucleotide
comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization
conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954; or (b) polynucleotides that hybridize to the complement of the polynucleotides
of (a) under stringent hybridization conditions. Biologically or immunologically active variants of
any of the polypeptide sequences in the Sequence Listing, and "substantial equivalents" thereof
(e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence
identity) that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g., host cells) of the invention.
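
For illustration only, the percent amino acid sequence identity referred to above can be computed for two sequences that have already been aligned, as in the following minimal Python sketch. The function name percent_identity, the use of "-" as a gap character, the convention of dividing by the number of aligned columns, and the example peptides are all assumptions made for this example. Other conventions for computing identity exist, and the text does not specify one.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned (equal length, '-' marks a gap).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        columns += 1
        if a == b:  # cannot both be '-' here, so this is a true residue match
            matches += 1
    if columns == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / columns


# Hypothetical peptides differing at one of eight positions.
identity = percent_identity("MKTAYIAK", "MKTAYLAK")
print(f"{identity:.1f}% identity; meets an 85% threshold: {identity >= 85.0}")
```

A threshold such as "at least about 90%" would then be a simple comparison against the returned value.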

The invention also provides compositions comprising a polypeptide of the invention. 
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a 
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the 
protein produced by such process is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical 
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition 
which comprises the step of administering to a mammalian subject a therapeutically effective 
amount of a composition comprising a polypeptide of the present invention and a 
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex, and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample, comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex,
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal 
antibodies, and optionally quantitative standards, for carrying out methods of the invention. 
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and 
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above. 

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Tables 2 and 9); for which they have
a signature region (as set forth in Tables 3 and 10); or for which they have homology to a gene
family (as set forth in Tables 4 and 11). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of applications,
as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind, or it may be "complete" such that
total complementarity exists between the single-stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
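
By way of illustration only, the base-pairing relationship described above can be sketched in Python. The function names reverse_complement and complementarity are hypothetical and are not part of the disclosure. The sketch assumes DNA sequences written 5' to 3' that contain only A, C, G and T, and it scores "partial" versus "complete" complementarity simply as the fraction of paired positions, which is one possible measure among several.

```python
# Watson-Crick pairing for DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions at which two equal-length strands (both written
    5'->3') base-pair when aligned antiparallel. 1.0 means complete."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    # The perfect partner of strand_b, read 5'->3', is its reverse complement.
    target = reverse_complement(strand_b)
    paired = sum(a == t for a, t in zip(strand_a.upper(), target))
    return paired / len(strand_a)


# The example from the text: 5'-AGT-3' pairs with 3'-TCA-5',
# which is 5'-ACT-3' when written in the conventional direction.
print(reverse_complement("AGT"))
print(complementarity("AGT", "ACT"))   # complete complementarity
print(complementarity("AGT", "ACC"))   # partial complementarity
```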

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs is nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954. The sequence information can be a segment of any one of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 that uniquely identifies or represents the sequence
information of that sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. One
such segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are three
billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are
300 times more twenty-mers than there are base pairs in a set of human chromosomes. Using the
same analysis, the probability for a seventeen-mer to be fully matched in the human genome is
approximately 1 in 5. When these segments are used in arrays for expression studies,
fifteen-mer segments can be used. The probability that the fifteen-mer is fully matched in the expressed
sequences is also approximately one in five because expressed sequences comprise less than
approximately 5% of the entire genome sequence.
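
The arithmetic in the preceding paragraph can be reproduced with the short Python sketch below, offered only as an illustration. It assumes a random-sequence model with equal base frequencies and uses the figures given in the text (three billion base pairs; expressed sequences taken as 5% of the genome). The function name kmers_per_position is hypothetical. Under these assumptions the ratios come out in the same range as the rounded figures quoted above.

```python
GENOME_BP = 3_000_000_000    # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05    # upper bound on the expressed fraction (from the text)


def kmers_per_position(k: int, target_bp: float) -> float:
    """Ratio of possible k-mers (4**k) to positions in a target of target_bp bases.

    A ratio of R corresponds to a chance of roughly 1 in R that a given
    k-mer is fully matched somewhere in the target (random-sequence model).
    """
    return 4 ** k / target_bp


cases = [
    (20, GENOME_BP, "genome"),
    (17, GENOME_BP, "genome"),
    (15, GENOME_BP * EXPRESSED_FRACTION, "expressed sequences"),
]
for k, target_bp, label in cases:
    ratio = kmers_per_position(k, target_bp)
    print(f"{k}-mer vs {label}: about 1 in {ratio:.1f}")
```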

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
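
The single-mismatch calculation described above can be sketched in the same way. This is an illustrative Python sketch under the same random-sequence assumption; the function name single_mismatch_odds and the choice to express the result as "1 in R" are assumptions made for the example, not part of the disclosure.

```python
GENOME_BP = 3_000_000_000          # base pairs in one set of human chromosomes (from the text)
EXPRESSED_BP = 0.05 * GENOME_BP    # expressed sequences taken as 5% of the genome


def single_mismatch_odds(k: int, target_bp: float) -> float:
    """Return R such that a given k-mer is expected to appear with exactly one
    mismatch roughly 1 time in R within a target of target_bp bases.

    Per position: the full-match probability (1 / 4**k) multiplied by the
    3 * k ways of changing a single base, as described in the text.
    """
    per_position = (1 / 4 ** k) * (3 * k)
    return 1 / (per_position * target_bp)


for k, target_bp, label in [
    (25, GENOME_BP, "genome"),
    (20, GENOME_BP, "genome"),
    (18, EXPRESSED_BP, "expressed sequences"),
]:
    print(f"{k}-mer, one mismatch, vs {label}: about 1 in {single_mismatch_odds(k, target_bp):.1f}")
```

Under this model the twenty-mer and eighteen-mer cases come out within a factor of two of the "one in five" figure quoted in the text.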

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
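
As a minimal sketch of this definition, the following Python function scans the three forward reading frames of a DNA sequence and returns the longest run of whole triplets that contains no termination codon. The function name longest_orf is hypothetical. The sketch assumes the standard termination codons TAA, TAG and TGA and, following the definition above, does not require an initiating ATG; practical ORF finders usually add a start-codon requirement and also scan the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code termination codons


def longest_orf(seq: str) -> str:
    """Return the longest run of whole codons, in any of the three forward
    frames, that contains no termination codon."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        current = []  # codons collected since the last stop codon in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                current = []
            else:
                current.append(codon)
                if len(current) * 3 > len(best):
                    best = "".join(current)
    return best


# Hypothetical 12-base sequence; frame 0 is interrupted by TAG,
# so a longer stop-free stretch is found in another frame.
print(longest_orf("ATGAAATAGCCC"))
```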

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 500 amino acids, more preferably less
than 200 amino acids, more preferably less than 150 amino acids and most preferably less than
100 amino acids. Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein, which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as 
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur 
in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.
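
The "redundancy" referred to above can be illustrated with the standard genetic code. The Python sketch below is an example only: the names CODON_TABLE, synonymous_codons and is_silent are hypothetical, and the table assumes the standard nuclear code with DNA codons. It lists the alternative codons that would leave the encoded amino acid unchanged and tests whether a given codon substitution is silent.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list[str]:
    """Other codons that encode the same amino acid (or stop) as `codon`."""
    codon = codon.upper()
    encoded = CODON_TABLE[codon]
    return [c for c, aa in CODON_TABLE.items() if aa == encoded and c != codon]


def is_silent(original: str, replacement: str) -> bool:
    """True if swapping one codon for the other leaves the amino acid unchanged."""
    return CODON_TABLE[original.upper()] == CODON_TABLE[replacement.upper()]


print(synonymous_codons("GAA"))   # alternatives for a glutamic acid codon
print(is_silent("CTG", "CTC"))    # both encode leucine
print(is_silent("GAA", "GAT"))    # glutamic acid vs aspartic acid
```

A silent change chosen from such a list could, for instance, be used to introduce or remove a restriction site without altering the encoded polypeptide.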

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
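
The four groupings listed above lend themselves to a simple lookup. The following Python sketch is illustrative only: it encodes those groups with one-letter amino acid codes and treats a substitution as "conservative" when both residues fall in the same group. The names GROUPS, CLASS_OF and is_conservative are hypothetical, and same-group membership is only one of several possible criteria for conservativeness.

```python
# Amino acid classes as listed in the text, using one-letter codes.
GROUPS = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}

# Reverse lookup: residue -> class name.
CLASS_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same class in GROUPS."""
    return CLASS_OF[original.upper()] == CLASS_OF[replacement.upper()]


print(is_conservative("L", "I"))   # leucine -> isoleucine, both nonpolar
print(is_conservative("D", "K"))   # aspartic acid -> lysine, acidic vs basic
```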

Alternatively, where alteration of function is desired, insertions, deletions or 
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations 
can, for example, alter one or more of the biological functions or biochemical characteristics of 
the polypeptides of the invention. For example, such alterations may change polypeptide 
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover 
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited 
for expression, scale up and the like in the host cells chosen for expression. For example, 
cysteine residues can be deleted or substituted with another amino acid residue in order to 
eliminate disulfide bridges. 

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological 
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the 
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more 
preferably at least 99% by weight, of the indicated biological macromolecules present (but water, 
buffers, and other small molecules, especially molecules having a molecular weight of less than 
1000 daltons, can be present). 
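
As a non-limiting sketch of the weight-percent criterion above, purity may be computed over the macromolecules present while leaving water, buffers, and other small molecules out of the denominator. The function name `percent_purity`, the input layout, and the treatment of 1000 daltons as a hard cutoff are assumptions made for this example only.

```python
def percent_purity(components, target, cutoff_da=1000.0):
    """Weight percent of `target` among components at or above the cutoff.

    `components` maps a component name to (mass, molecular_weight_da).
    Masses may be in any unit, provided every entry uses the same unit.
    """
    target_mass, target_mw = components[target]
    if target_mw < cutoff_da:
        raise ValueError("target falls below the small-molecule cutoff")
    # Small molecules (below the cutoff) are excluded from the denominator.
    macromolecule_mass = sum(
        mass for mass, mw in components.values() if mw >= cutoff_da
    )
    if macromolecule_mass <= 0:
        raise ValueError("no macromolecule mass present")
    return 100.0 * target_mass / macromolecule_mass
```

A preparation would satisfy the embodiment above when the returned value is at least 95, and more preferably at least 99, regardless of how much buffer is present.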

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use 
in yeast or eukaryotic expression systems preferably include a leader sequence enabling 
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which 
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed. 
"Secreted" proteins also include without limitation proteins that are transported across the 
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include 
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and 
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g., 
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol. 
16:27-55). 

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the 
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization 
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e., 
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are 
described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent 
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base 
oligonucleotides), and 60°C (for 23-base oligonucleotides). 
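
The exemplary oligonucleotide wash temperatures above can be tabulated for reference, as in the following minimal Python sketch. The names `OLIGO_WASH_TEMP_C` and `wash_temperature_c` are hypothetical. The sketch deliberately returns a value only for the four lengths actually recited and makes no attempt to interpolate for other lengths, since no rule for doing so is given here.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as recited in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length: int) -> int:
    """Return the recited wash temperature for a recited oligonucleotide length."""
    if oligo_length not in OLIGO_WASH_TEMP_C:
        raise ValueError(
            f"no exemplary wash temperature is recited for {oligo_length}-base oligonucleotides"
        )
    return OLIGO_WASH_TEMP_C[oligo_length]
```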

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a 
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment, 
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by 
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no 
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no 
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid 
sequences according to the invention preferably have at least 80% sequence identity with a listed 
amino acid sequence, more preferably at least 85% sequence identity, more preferably at least 
90% sequence identity, more preferably at least 95% sequence identity, and most preferably at 
least 98% sequence identity. Substantially equivalent 
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, a nucleotide 
sequence has at least about 65% identity, more preferably at least about 75% identity, more 
preferably at least about 80% identity, more preferably at least about 85% identity, more 
preferably at least about 90% identity, more preferably at least about 95% identity, more 
preferably at least 98% identity, and most preferably at least about 99% identity. For the purposes 
of the present invention, sequences having substantially equivalent biological activity and 
substantially equivalent expression characteristics are considered substantially equivalent. For 
the purposes of determining equivalence, truncation of the mature sequence (e.g., via a mutation 
which creates a spurious stop codon) should be disregarded. Sequence identity may be determined, 
e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between 
sequences can also be determined by other methods known in the art, e.g., by varying 
hybridization conditions. 
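
The arithmetic of the definition above can be made concrete with a short Python sketch. It assumes that the counts of substitutions, additions, and deletions have already been obtained from an alignment (for example by the Jotun Hein method); the function names are hypothetical, and the default threshold of 0.35 is taken directly from the definition. The sketch does not address the functional requirement that the variation produce no adverse functional dissimilarity.

```python
def fraction_varied(substitutions, additions, deletions, variant_length):
    """Changes divided by the number of residues in the variant sequence."""
    if variant_length <= 0:
        raise ValueError("variant_length must be positive")
    return (substitutions + additions + deletions) / variant_length


def percent_identity(substitutions, additions, deletions, variant_length):
    """Percent sequence identity implied by the definition (100% minus percent varied)."""
    if variant_length <= 0:
        raise ValueError("variant_length must be positive")
    changes = substitutions + additions + deletions
    return 100.0 * (variant_length - changes) / variant_length


def is_substantially_equivalent(substitutions, additions, deletions,
                                variant_length, max_fraction=0.35):
    """True when the varied fraction does not exceed the chosen threshold."""
    varied = fraction_varied(substitutions, additions, deletions, variant_length)
    return varied <= max_fraction
```

For a 100-residue variant carrying 20 substitutions, 5 additions, and 5 deletions, the varied fraction is 0.30, which corresponds to 70% sequence identity and falls within the 35% limit. Passing a smaller `max_fraction` (for example 0.05) applies the narrower embodiments.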

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether 
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; a 
polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 985-1968, 2953-3936, 
3943-3948 or 3955-3960; and a polynucleotide comprising the nucleotide sequence encoding the 
mature protein coding sequence of the polypeptides of any one of SEQ ID NO: 985-1968, 
2953-3936, 3943-3948 or 3955-3960. The polynucleotides of the present invention also include, but 
are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the 
complement of any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 
or 3949-3954; (b) nucleotide sequences encoding any one of the amino acid sequences set forth 
in the Sequence Listing as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960; (c) a 
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a 
polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a 
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the 
polypeptides of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. Domains of 
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like 
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or 
combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 

The polynucleotides of the invention include naturally occurring or wholly or partially 
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed 
herein. The corresponding genes can be isolated in accordance with known methods using the 
sequence information disclosed herein. Such methods include the preparation of probes or primers 
from the disclosed sequence information for identification and/or amplification of genes in 
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can 
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that 
corresponds to any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 
3949-3954 can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable 
hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 
3937-3942 or 3949-3954 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID 
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 may be used as the basis for suitable primer(s) 
that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA 
libraries. 

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information, 
representative fragment or segment information, or novel segment information for the full-length 
gene. 

The invention also provides polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about 
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%, 
88%, 89%, and more typically at least about 90%, 91%, 92%, 93%, 94%, and even more 
typically at least about 95%, 96%, 97%, 98%, 99% sequence identity to a polynucleotide recited 
above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid 
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences 
of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or complements thereof, which 
fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater 
than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the 
polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to 
a polynucleotide can differentiate polynucleotide sequences of the invention from other 
polynucleotide sequences in the same family of genes or can differentiate human genes from 
genes of other species, and are preferably based on unique nucleotide sequences. 
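
One simple way to look for candidate probe regions based on unique nucleotide sequences is to list the fixed-length windows of a target that do not occur in related sequences. The Python sketch below is offered under clear assumptions: the function names are hypothetical, the default window of 20 nucleotides is merely one of the exemplary fragment lengths, and exact string matching is only a crude proxy for specific hybridization (it ignores mismatched duplexes, reverse complements, and wash conditions).

```python
def kmers(sequence, k):
    """Return the set of all length-k windows in a sequence (case-insensitive)."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_probe_candidates(target, related_sequences, k=20):
    """List length-k windows of `target` absent from every related sequence.

    Windows are returned in order of position along the target.
    """
    seen_elsewhere = set()
    for other in related_sequences:
        seen_elsewhere |= kmers(other, k)
    seq = target.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return [window for window in windows if window not in seen_elsewhere]
```

Candidates returned by such a screen would still need to be confirmed experimentally under the stringent hybridization conditions defined above.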

The sequences falling within the scope of the present invention are not limited to these 
specific sequences, but also include allelic and species variations thereof. Allelic and species 
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-984, 
1969-2952, 3937-3942 or 3949-3954, a representative fragment thereof, or a nucleotide sequence at 
least 90% identical, preferably 95% identical, to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 
3949-3954 with a sequence from another isolate of the same species. Furthermore, to accommodate 
codon variability, the invention includes nucleic acid molecules coding for the same amino acid 
sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an 
ORF, substitution of one codon for another codon that encodes the same amino acid is expressly 
contemplated. 
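
The codon variability contemplated above can be checked mechanically: two ORFs are equivalent in this sense when they translate to the same amino acid sequence. The following Python sketch assumes the standard genetic code and in-frame ORFs whose length is a multiple of three; the names `CODON_TABLE`, `translate` and `encode_same_protein` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame DNA or RNA ORF; stop codons are shown as '*'."""
    seq = orf.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def encode_same_protein(orf_a, orf_b):
    """True when two ORFs differ, if at all, only by synonymous codons."""
    return translate(orf_a) == translate(orf_b)
```

For instance, replacing the alanine codon GCT with GCC leaves the encoded polypeptide unchanged, so the two ORFs would be reported as encoding the same protein.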

The nearest neighbor or homology result for the nucleic acids of the present invention, 
including SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, can be obtained by searching a 
database using an algorithm or a program. Preferably, BLAST, which stands for Basic Local 
Alignment Search Tool, is used to search for local sequence alignments (Altschul, S.F., J. Mol. 
Evol. 36:290-300 (1993) and Altschul, S.F., et al., J. Mol. Biol. 215:403-410 (1990)). 
Alternatively, a FASTA version 3 search against Genpept, using the Fastxy algorithm, can be used. 

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species. 
The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which 
encode variants of the described nucleic acids. These amino acid sequence variants may be 
prepared by methods known in the art by introducing appropriate nucleotide changes into a 
native or variant polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids 
encoding the amino acid sequence variants are preferably constructed by mutating the 
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic 
acid alterations can be made at sites that differ in the nucleic acids from different species 
(variable positions) or in highly conserved regions (constant regions). Sites at such locations 
will typically be modified in series, e.g., by substituting first with conservative choices (e.g., 
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant 
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions 
may be made at the target site. Amino acid sequence deletions generally range from about 1 to 
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid 
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one 
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid 
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues, 
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal 
sequences necessary for secretion or for intracellular targeting in different host cells and 
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein. 

In a preferred method, polynucleotides encoding the novel amino acid sequences are 
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a 
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent 
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the 
site being changed. In general, the techniques of site-directed mutagenesis are well known to 
those of skill in the art and this technique is exemplified by publications such as Edelman et al., 
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a 
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids. 
When small amounts of template DNA are used as starting material, primer(s) that differ 
slightly in sequence from the corresponding region in the template DNA can generate the desired 
amino acid variant. PCR amplification results in a population of product DNA fragments that 
differ from the polynucleotide template encoding the polypeptide at the position specified by the 
primer. The product DNA fragments replace the corresponding region in the plasmid and this 
gives a polynucleotide encoding the desired amino acid variant. 
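
The layout of a mutagenic oligonucleotide, that is, the altered codon flanked on both sides by unchanged template sequence, can be sketched in Python as follows. This is a minimal illustration rather than a primer-design procedure: the function name `mutagenic_oligo` is hypothetical, the default flank of 12 nucleotides per side is an arbitrary assumption (the text requires only that the flanks be sufficient to form a stable duplex), and no melting temperature or secondary-structure check is performed.

```python
def mutagenic_oligo(coding_sequence, codon_index, new_codon, flank=12):
    """Return template sequence with one codon replaced, plus flanks on each side.

    `codon_index` is zero-based and counted in codons from the start of
    `coding_sequence`, which is assumed to be in frame.
    """
    seq = coding_sequence.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(seq):
        raise ValueError("not enough template on one side for the requested flank")
    # Unchanged template on the left, the altered codon, unchanged template on the right.
    return seq[start - flank:start] + new_codon.upper() + seq[end:end + flank]
```

The returned oligonucleotide matches the template everywhere except at the chosen codon, which is the property relied upon both in site-directed mutagenesis and in the PCR approach described above.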

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well 
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current 
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic 
code, other DNA sequences which encode substantially the same or a functionally equivalent 
amino acid sequence may be used in the practice of the invention for the cloning and expression 
of these novel nucleic acids. Such DNA sequences include those which are capable of 
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention can be used 
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-984, 1969-2952, 
3937-3942 or 3949-3954, or functional equivalents thereof, may be used to generate recombinant 
DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof, 
in appropriate host cells. Also included are the cDNA inserts of any of the clones identified 
herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular 
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 
3949-3954 or a fragment thereof or any other polynucleotides of the invention. In one embodiment, 
the recombinant constructs of the present invention comprise a vector, such as a plasmid or viral 
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-984, 
1969-2952, 3937-3942 or 3949-3954 or a fragment thereof is inserted, in a forward or reverse 
orientation. In the case of a vector comprising one of the ORFs of the present invention, the 
vector may further comprise regulatory sequences, including for example, a promoter, operably 
linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill 
in the art and are commercially available for generating the recombinant constructs of the present 
invention. The following vectors are provided by way of example. Bacterial: pBs, phagescript, 
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, 
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, 
pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia). 

The isolated polynucleotide of the invention may be operably linked to an expression 
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al., 
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many 
suitable expression control sequences are known in the art. General methods of expressing 
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in 
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated 
polynucleotide of the invention and an expression control sequence are situated within a vector 
or cell in such a way that the protein is expressed by a host cell which has been transformed 
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol 
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, 
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine 
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of 
the appropriate vector and promoter is well within the level of ordinary skill in the art. 
Generally, recombinant expression vectors will include origins of replication and selectable 
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli 
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct 
transcription of a downstream structural sequence. Such promoters can be derived from operons 
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid 
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is 
assembled in appropriate phase with translation initiation and termination sequences, and 
preferably, a leader sequence capable of directing secretion of translated protein into the 
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a 
fusion protein including an amino terminal identification peptide imparting desired 
characteristics, e.g., stabilization or simplified purification of expressed recombinant product. 
Useful expression vectors for bacterial use are constructed by inserting a structural DNA 
sequence encoding a desired protein together with suitable translation initiation and termination 
signals in operable reading phase with a functional promoter. The vector will comprise one or 
more phenotypic selectable markers and an origin of replication to ensure maintenance of the 
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for 
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species 
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be 
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially 
available plasmids comprising genetic elements of the well known cloning vector pBR322 
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine 
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These 
pBR322 "backbone" sections are combined with an appropriate promoter and the structural 
sequence to be expressed. Following transformation of a suitable host strain and growth of the 
host strain to an appropriate cell density, the selected promoter is induced or derepressed by 
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an 
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. For 
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by 
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies 
against the encoded polypeptide following topical administration of naked plasmid DNA or 
following injection, and preferably intramuscular injection of the DNA. The nucleic acid 
sequences are preferably inserted in a recombinant expression vector and may be in the form of 
naked DNA. 

4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that 
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide 
sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or fragments, analogs or 
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is 
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding 
strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In 
specific aspects, antisense nucleic acid molecules are provided that comprise a sequence 
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding 
strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs, 
derivatives and analogs of a protein of any of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO: 
1-984, 1969-2952, 3937-3942 or 3949-3954 are additionally provided. 
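
Because an antisense sequence is complementary to a stretch of the coding (sense) strand, a candidate can be derived by taking the reverse complement of that stretch. The Python sketch below assumes Watson-Crick pairing on a DNA coding strand written 5' to 3'; the function names are hypothetical, and the default length of 25 nucleotides is simply one of the exemplary lengths recited above. Chemical modification, target accessibility, and specificity are outside its scope.

```python
# Watson-Crick complements for DNA, preserving letter case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start, length=25):
    """Antisense sequence for `length` nucleotides of the coding strand.

    `start` is a zero-based offset into `coding_strand`.
    """
    if start < 0 or length <= 0 or start + length > len(coding_strand):
        raise ValueError("requested window lies outside the coding strand")
    return reverse_complement(coding_strand[start:start + length])
```

Choosing `start` so that the window spans a translation start codon gives an antisense sequence directed at the region surrounding the translation start site.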

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" 
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not 
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID 
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954), antisense nucleic acids of the invention can be 
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense 
nucleic acid molecule can be complementary to the entire coding region of an mRNA, but more 
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding 
region of an mRNA. For example, the antisense oligonucleotide can be complementary to the 
region surrounding the translation start site of an mRNA. An antisense oligonucleotide can be, for 
example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic 
acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions 
using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense 
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or 
variously modified nucleotides designed to increase the biological stability of the molecules or to 
increase the physical stability of the duplex formed between the antisense and sense nucleic 
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. 
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Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl- 
2-thiouridine, 5^arboxymethylaminomethyluracil 3 dihydrouracil, beta-D-galactosylqueosine, 
5 inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 
7-methylguanine, 5 -methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 
beta-D-mannosylqueosine, 5-methoxycarboxymethyluracil, 5-methoxyuracil, 

2- methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, 
10 queosine, 2-thiocytosine, 5 -methy 1-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, 

uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 

3- (3-amino-3-N-2-carboxypropyl) iiracil, (acp3)w ? and 2,6-diaminopurine. Alternatively, the 
antisense nucleic acid can be produced biologically using an expression vector into which a 
nucleic acid has been subcloned in an antisense orientation (Le. 9 RNA transcribed from the 

15 inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 
described further in the following subsection). 
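
The Watson-Crick design rule described above can be illustrated with a minimal sketch. It assumes
a made-up toy mRNA (not one of the disclosed SEQ ID NOs), hypothetical function names, and a
20-nucleotide oligonucleotide centered near the first AUG; it is not a validated design procedure.

```python
# Toy illustration only: the sequence below is invented, not taken from the sequence listing.
COMPLEMENT = str.maketrans("ACGU", "TGCA")  # RNA base -> complementary DNA base


def find_start_codon(mrna: str) -> int:
    """Return the index of the first AUG (a real design would confirm the true start site)."""
    return mrna.upper().replace("T", "U").find("AUG")


def antisense_oligo(mrna: str, center: int, length: int = 20) -> str:
    """Return a DNA oligo (5'->3') complementary to `length` bases of `mrna`,
    positioned as close as possible to being centered on index `center`."""
    rna = mrna.upper().replace("T", "U")
    if not 0 < length <= len(rna):
        raise ValueError("length must be between 1 and len(mrna)")
    start = max(0, min(center - length // 2, len(rna) - length))
    target = rna[start:start + length]
    # Complement each base, then reverse so the oligo is written 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


toy_mrna = "GGCACGAGCUCCAGGAUGGCUUCCAAGGUCAUCGGAACU"
aug = find_start_codon(toy_mrna)
oligo = antisense_oligo(toy_mrna, center=aug + 1, length=20)
print(oligo)
```

The length argument can be set to any of the oligonucleotide lengths listed above; Hoogsteen
pairing and modified nucleotides are outside the scope of this sketch.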

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954). For example, a derivative of a Tetrahymena L-19
IVS RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
SECX mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from
a pool of RNA molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.
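
As a hedged illustration of how candidate hammerhead target sites might be located in a transcript,
the sketch below scans for a cleavage triplet. The use of GUC as the default triplet is an
assumption drawn from common hammerhead practice rather than from this disclosure, and the
function name is hypothetical.

```python
import re


def hammerhead_sites(mrna: str, triplet: str = "GUC") -> list[int]:
    """Return 0-based indices of the base immediately 3' of each `triplet`.

    A hammerhead ribozyme is generally taken to cleave just after its target
    triplet, so each returned index marks the first base downstream of a cut.
    The default triplet is an assumption, not a value stated in the text.
    """
    rna = mrna.upper().replace("T", "U")
    # A lookahead keeps overlapping matches, should a custom triplet allow them.
    pattern = f"(?={re.escape(triplet)})"
    return [m.start() + len(triplet) for m in re.finditer(pattern, rna)]


print(hammerhead_sites("AAGUCAAGUCGG"))
```

The ribozyme binding arms would then be chosen complementary to the sequence flanking a selected
site; accessibility of the site in the folded mRNA is not modeled here.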

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.



WO 01/57190 PCT/US01/04098

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960 or an amino acid sequence encoded by any one of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or the corresponding full
length or mature protein. Polypeptides of the invention also include polypeptides preferably with
biological or immunological activity that are encoded by: (a) a polynucleotide having any one of
the nucleotide sequences set forth in SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or
(b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO:
985-1968, 2953-3936, 3943-3948 or 3955-3960 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960
or the corresponding full length or mature protein; and "substantial equivalents" thereof (e.g.,
with at least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%,
83%, 84%, more typically at least about 85%, 86%, 87%, 88%, 89%, and more typically at least
about 90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%,
99% sequence identity) that retain biological activity. Polypeptides encoded by allelic variants may
have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID
NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
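
The percent sequence identity thresholds recited above can be made concrete with a small sketch.
It assumes the two sequences have already been aligned to equal length with '-' marking gaps; the
alignment method, the gap handling and the function name are illustrative assumptions, since this
passage does not specify how identity is to be calculated.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Columns where both sequences carry a gap are ignored; a column counts as a
    match only when both residues are identical and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy peptides (invented): one substitution in ten aligned positions.
identity = percent_identity("MKTAYIAKQR", "MKTGYIAKQR")
print(identity)
print(identity >= 85.0)  # would meet an "at least about 85%" threshold
```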

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.
Protein compositions of the present invention may further comprise an acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
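
The definition of a "degenerate variant" given above can be checked mechanically. The following
is a minimal sketch using the standard genetic code; the two toy ORFs are invented for
illustration and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf: str) -> str:
    """Translate a DNA ORF whose length is a multiple of three."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True when the nucleotide sequences differ but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Toy ORFs (invented): both encode Met-Ala-Ser using different Ala and Ser codons.
print(translate("ATGGCTTCC"), translate("ATGGCAAGC"))
print(is_degenerate_variant("ATGGCTTCC", "ATGGCAAGC"))
```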

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method, which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
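
The single-residue case of the alanine-scanning method described above can be sketched as
follows. The peptide is an invented toy sequence and the function name is hypothetical; the
sketch only enumerates the variants to be made, and the activity testing that gives the method
its value is a laboratory step not modeled here.

```python
def alanine_scan(protein: str) -> list[tuple[int, str, str]]:
    """Enumerate single-residue alanine substitutions.

    Returns (1-based position, original residue, variant sequence) for every
    position that is not already alanine.
    """
    protein = protein.upper()
    return [
        (i + 1, residue, protein[:i] + "A" + protein[i + 1:])
        for i, residue in enumerate(protein)
        if residue != "A"
    ]


# Toy peptide (invented): each variant would then be assayed for biological activity.
for position, residue, variant in alanine_scan("MKCAW"):
    print(f"{residue}{position}A", variant)
```

Substituting strings of adjacent residues, as the text also contemplates, would follow the same
pattern with a sliding window in place of a single position.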

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under
culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or with a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptor and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY
AND SIMILARITY

Preferred methods for determining identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified in
computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul, S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam
software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein
incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol.
Biol., 157, pp. 105-31 (1982), incorporated herein by reference). The BLAST programs are
publicly available from the National Center for Biotechnology Information (NCBI) and other
sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD 20894; Altschul, S.,
et al., J. Mol. Biol. 215:403-410 (1990)).
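
To make the notion of "largest match" concrete, the following Python sketch performs a simple global alignment (a Needleman-Wunsch style dynamic program) and reports percent identity over the alignment length. It is an illustration only and is not the algorithm of GAP, BLAST or any other program cited above; the scoring values (match +1, mismatch -1, linear gap -2) and the function names are assumptions chosen for the example.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> tuple[str, str]:
    """Globally align two sequences with a linear gap penalty and
    return the two gapped strings ('-' marks a gap)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

Published programs differ in scoring matrices, gap penalties and the denominator used for percent identity, so values from this sketch are not expected to reproduce their output.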

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention comprise one or more domains fused to
sequences derived from a member of the immunoglobulin protein family. The immunoglobulin
fusion proteins of the invention can be incorporated into pharmaceutical compositions and
administered to a subject to inhibit an interaction between a ligand and a protein of the invention
on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin
fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the
ligand/protein interaction may be useful therapeutically for both the treatment of proliferative
and differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting)
cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to
identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
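
The requirement that the fused sequences remain in-frame can be checked in silico before any cloning is attempted. The Python sketch below is a minimal illustration under stated assumptions: both inputs are complete coding sequences given 5' to 3' as DNA, the upstream partner's stop codon (if present) is dropped, and an optional linker may be supplied. The function name `fuse_in_frame` and its parameters are hypothetical and do not describe any particular vector or kit.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(n_term_cds: str, c_term_cds: str, linker: str = "") -> str:
    """Join two coding sequences (optionally via a linker) so that the
    downstream partner stays in the reading frame of the upstream partner."""
    parts = {
        "N-terminal CDS": n_term_cds.upper(),
        "linker": linker.upper(),
        "C-terminal CDS": c_term_cds.upper(),
    }
    for name, seq in parts.items():
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length {len(seq)} is not a multiple of 3")

    n_term = parts["N-terminal CDS"]
    # Remove the upstream stop codon so translation reads through into the partner.
    if n_term[-3:] in STOP_CODONS:
        n_term = n_term[:-3]

    fusion = n_term + parts["linker"] + parts["C-terminal CDS"]
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    # Any stop codon before the final codon would truncate the fusion protein.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature in-frame stop codon in fusion")
    return fusion


if __name__ == "__main__":
    print(fuse_in_frame("ATGAAATAA", "GGCTGCTGA", linker="GGATCC"))
```

A check of this kind only confirms the reading frame; it says nothing about expression level, folding or activity of the resulting fusion protein.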

4.8 GENE THERAPY

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention; or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their transcribed RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.
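
As a simple illustration of how an antisense molecule relates to its target, the Python sketch below derives the DNA sequence complementary to a target DNA or RNA segment, written 5' to 3'. It shows only the base-pairing relationship; the function name `antisense_dna` is hypothetical, and practical antisense design involves further considerations (target accessibility, length, chemical modification) that are not modeled here.

```python
# Watson-Crick complements; U in an RNA target pairs with A in the DNA antisense strand.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(target: str) -> str:
    """Return the DNA reverse complement (5'->3') of a DNA or RNA target (5'->3')."""
    target = target.upper()
    invalid = set(target) - set("ACGTU")
    if invalid:
        raise ValueError(f"unexpected characters in target: {sorted(invalid)}")
    return target.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense_dna("AUGGCC"))
```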

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knock out strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of polypeptide as well as for studying modulators of the
polypeptides of the invention.


4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli,
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6--Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11--Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1, John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9--Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stroma cells transfected
with a polynucleotide that encodes for the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g. as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular 
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993. 

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY



WO 01/57190 PCT/US01/04098

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention. 

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular 
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the 
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).



4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 
A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.
Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. 
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which 
will identify, among others, proteins that modulate T-cell dependent antibody responses and that 
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production, 
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins 
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by 
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY 

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et 
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

25 Assays for chemotactic activity (which will identify proteins that induce or prevent 

chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells 
across a membrane as well as the ability of a protein to induce the adhesion of one cell 
population to another cell population. Suitable assays for movement and adhesion include, 
without limitation, those, described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. 

30 M. Kruisbeek, D. H. Marguiles, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates 
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines 
6.12.1-6.12.28; Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146, 
1995; Muller et al Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867, 
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994. 
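
By way of illustration only, the readout of a membrane migration assay of the kind referred to above may be reduced to a single ratio. The following is a minimal sketch, assuming that migrated cells are counted in replicate wells with and without the test protein; the function names, the "chemotactic index" ratio and the two-fold threshold are assumptions made for this example and are not taken from the cited references.

```python
from statistics import mean


def chemotactic_index(test_counts, control_counts):
    """Ratio of mean migrated-cell count with the test protein to the
    mean count with buffer alone (hypothetical summary statistic)."""
    control_mean = mean(control_counts)
    if control_mean <= 0:
        raise ValueError("control migration must be positive")
    return mean(test_counts) / control_mean


def is_chemotactic(test_counts, control_counts, threshold=2.0):
    """Flag a protein as chemotactic when the index reaches an
    illustrative threshold (2.0 is an assumption, not a standard)."""
    return chemotactic_index(test_counts, control_counts) >= threshold


if __name__ == "__main__":
    # Hypothetical counts of cells that crossed the membrane per well.
    with_protein = [120, 130, 110]
    buffer_only = [40, 40, 40]
    print(chemotactic_index(with_protein, buffer_only))
    print(is_chemotactic(with_protein, buffer_only))
```

A ratio near one would suggest no directed movement under the conditions tested, whereas the threshold for calling a protein active would in practice be set by the particular assay employed.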

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis, thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP-16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.,
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g., from American Type Tissue Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands)) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses. Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.



4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
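
The diminution in complex formation measured in a competitive binding assay may, for illustration, be expressed as a percent inhibition relative to an untreated control. The following is a minimal sketch under stated assumptions: the signal values, the background correction, the compound names and the 50% cutoff are all hypothetical and serve only to show the arithmetic.

```python
def percent_inhibition(signal_with_compound, signal_no_compound, background=0.0):
    """Percent reduction in specific binding caused by a test compound.

    Signals are arbitrary assay units (e.g., counts); `background` is the
    nonspecific signal subtracted from both readings.
    """
    total = signal_no_compound - background
    if total <= 0:
        raise ValueError("control signal must exceed background")
    bound = signal_with_compound - background
    return 100.0 * (total - bound) / total


def rank_hits(readings, signal_no_compound, background=0.0, cutoff=50.0):
    """Return (compound, percent inhibition) pairs at or above `cutoff`,
    strongest inhibitor first. The cutoff is an illustrative assumption."""
    scored = {
        name: percent_inhibition(signal, signal_no_compound, background)
        for name, signal in readings.items()
    }
    hits = [(name, value) for name, value in scored.items() if value >= cutoff]
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical plate readings for three library compounds.
    plate = {"compound_1": 300.0, "compound_2": 950.0, "compound_3": 150.0}
    for name, value in rank_hits(plate, signal_no_compound=1100.0, background=100.0):
        print(name, round(value, 1))
```

Compounds passing such a cutoff would be candidates for the follow-up testing described in this section; the cutoff itself would depend on the assay format chosen.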

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol. 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol.
1(1):114-19 (1997); Donier et al., Bioorg. Med. Chem. 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
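
The comparison of two otherwise identical cell populations described above may be sketched as follows. This is an illustrative example only: it assumes each candidate ligand yields one numeric response per population, and the function name, the ligand labels and the two-fold criterion are hypothetical.

```python
def receptor_dependent_responses(expressing, control, min_fold=2.0):
    """Return ligands whose response in receptor-expressing cells exceeds
    the response in receptor-negative cells by at least `min_fold`.

    `expressing` and `control` map ligand name -> measured response
    (arbitrary units). Ligands lacking a positive control reading are skipped.
    """
    hits = {}
    for ligand, response in expressing.items():
        baseline = control.get(ligand)
        if baseline is None or baseline <= 0:
            continue
        fold = response / baseline
        if fold >= min_fold:
            hits[ligand] = fold
    return hits


if __name__ == "__main__":
    # Hypothetical responses for two candidate ligands.
    with_receptor = {"ligand_A": 900.0, "ligand_B": 110.0}
    without_receptor = {"ligand_A": 100.0, "ligand_B": 100.0}
    print(receptor_dependent_responses(with_receptor, without_receptor))
```

A ligand that elicits a response only in the receptor-expressing population is a candidate binding partner; a response seen in both populations is more likely independent of the receptor of the invention.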

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or inflammation
resulting from overproduction of cytokines such as TNF or IL-1. Compositions of the invention
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or may be utilized as an
antiproliferative agent such as for acute or chronic myelogenous leukemia or in the prevention
of premature labor secondary to intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or components; affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
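
Two of the approaches above, sequencing of an amplified fragment and restriction fragment length polymorphism analysis, may be illustrated computationally. The following is a minimal sketch, assuming the reference and patient sequences are already aligned and of equal length; the short sequences are invented for the example, and the only enzyme fact relied on is that EcoRI recognizes GAATTC and cuts after the first G.

```python
def find_snps(reference, sample):
    """List (1-based position, reference base, sample base) for every
    single-base difference between two aligned, equal-length sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i + 1, r, s) for i, (r, s) in enumerate(pairs) if r != s]


def fragment_lengths(sequence, site, cut_offset):
    """Fragment lengths after digesting a linear sequence at every
    occurrence of `site`, cutting `cut_offset` bases into the site."""
    sequence, site = sequence.upper(), site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Invented 15-base fragments; the sample carries an A-to-G change
    # that destroys the EcoRI site present in the reference.
    reference = "ATGCGAATTCTTAGC"
    sample = "ATGCGAGTTCTTAGC"
    print(find_snps(reference, sample))
    print(fragment_lengths(reference, "GAATTC", 1))
    print(fragment_lengths(sample, "GAATTC", 1))
```

In this sketch the polymorphism is detectable both by direct comparison of the sequences and by the differing digestion pattern, mirroring the two detection routes named in the text.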

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered saline (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium in CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
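
The comparison of arthritis scores between treated and PBS control animals on the scoring days named above may be tabulated as in the following sketch. The scoring days come from the text; the animal identifiers, the score values and the percent-reduction summary are hypothetical and are offered only to illustrate how such data might be reduced.

```python
from statistics import mean

# Days after injection on which the overall arthritis score is obtained.
SCORING_DAYS = (14, 15, 18, 20, 22, 24)


def mean_score_by_day(scores_by_animal):
    """Average the arthritis score across animals for each scoring day.

    `scores_by_animal` maps animal id -> {day: score}.
    """
    return {
        day: mean(animal[day] for animal in scores_by_animal.values())
        for day in SCORING_DAYS
    }


def percent_reduction(treated, control):
    """Percent decrease of the mean treated score relative to the mean
    control score, per day (days with a zero control mean are skipped)."""
    treated_mean = mean_score_by_day(treated)
    control_mean = mean_score_by_day(control)
    return {
        day: 100.0 * (control_mean[day] - treated_mean[day]) / control_mean[day]
        for day in SCORING_DAYS
        if control_mean[day] > 0
    }


if __name__ == "__main__":
    # Hypothetical scores, held constant across days for simplicity.
    control = {
        "rat_1": {day: 8 for day in SCORING_DAYS},
        "rat_2": {day: 12 for day in SCORING_DAYS},
    }
    treated = {
        "rat_3": {day: 4 for day in SCORING_DAYS},
        "rat_4": {day: 6 for day in SCORING_DAYS},
    }
    print(percent_reduction(treated, control))
```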

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
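
As an arithmetic illustration of the per-kilogram ranges stated above (about 0.01 µg/kg to 100 mg/kg, preferably about 0.1 µg/kg to 10 mg/kg), the following sketch converts them into absolute amounts for a given body weight. The function and constant names are hypothetical, and the sketch is not a dosing recommendation; the text leaves the actual dosage to the prescribing physician.

```python
# Illustrative unit arithmetic only. Names are hypothetical.
# Ranges are expressed in mg per kg of body weight (1 µg = 0.001 mg).

BROAD_RANGE_MG_PER_KG = (0.01e-3, 100.0)     # about 0.01 µg/kg to 100 mg/kg
PREFERRED_RANGE_MG_PER_KG = (0.1e-3, 10.0)   # about 0.1 µg/kg to 10 mg/kg


def dose_window_mg(body_weight_kg, range_mg_per_kg):
    """Return the (low, high) absolute dose in mg for a body weight in kg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


def within_range(dose_mg, body_weight_kg, range_mg_per_kg):
    """True if an absolute dose in mg falls inside the per-kg range."""
    low, high = dose_window_mg(body_weight_kg, range_mg_per_kg)
    return low <= dose_mg <= high
```

For a hypothetical 70 kg patient, the preferred range corresponds to roughly 0.007 mg to 700 mg per dose.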
4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or antithrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or antithrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into an arthritic joint or into fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage 
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the 
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will 
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art.

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient, 
optionally grinding a resulting mixture, and processing the mixture of granules, after adding 
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin,
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 
Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
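
The 1:1 dilution of VPD with 5% dextrose in water described above implies the following final component concentrations, sketched here as simple arithmetic. The sketch assumes that volumes are additive on mixing, which is an idealization, and the function and variable names are hypothetical.

```python
# Illustrative arithmetic for the VPD:5W co-solvent system.
# Assumption: volumes are additive on mixing (ideal mixing), so each % w/v
# value scales with the volume fraction of the solution it came from.

VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W_PERCENT_WV = {"dextrose": 5.0}


def dilute(stock, diluent, stock_parts=1.0, diluent_parts=1.0):
    """Return % w/v of every component after mixing stock and diluent."""
    total = stock_parts + diluent_parts
    if stock_parts < 0 or diluent_parts < 0 or total <= 0:
        raise ValueError("parts must be non-negative and not both zero")
    mixed = {}
    for name, pct in stock.items():
        mixed[name] = mixed.get(name, 0.0) + pct * stock_parts / total
    for name, pct in diluent.items():
        mixed[name] = mixed.get(name, 0.0) + pct * diluent_parts / total
    return mixed


vpd_5w = dilute(VPD_PERCENT_WV, D5W_PERCENT_WV)
```

Under this assumption, VPD:5W contains 1.5% w/v benzyl alcohol, 4% w/v polysorbate 80, 32.5% w/v polyethylene glycol 300 and 2.5% w/v dextrose. Changing `stock_parts` and `diluent_parts` shows how the proportions could be varied, as the text contemplates.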

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanolamine and
the like.

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention.

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and severity of
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 ng to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic

composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 

described above, may alternatively or additionally, be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that can be used to more accurately determine useful doses in
humans. For example, a dose can be formulated in animal models to achieve a circulating
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of
the test compound which achieves a half-maximal inhibition of the protein's biological activity).
Such information can be used to more accurately determine useful doses in humans.
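
One simple way to read an IC50 off cell culture data is to interpolate between the two tested concentrations that bracket 50% inhibition. The sketch below does this on a log-concentration scale. It is an assumed illustration rather than a method prescribed by the text (a fitted dose-response curve would be a common alternative), and the function name and inputs are hypothetical.

```python
import math

# Illustrative IC50 estimate by log-linear interpolation.
# Assumptions: concentrations are positive, inhibition is given in percent,
# and inhibition rises with concentration across the bracketing pair.


def estimate_ic50(concentrations, percent_inhibition):
    """Interpolate the concentration giving 50 % inhibition."""
    if len(concentrations) != len(percent_inhibition):
        raise ValueError("inputs must have the same length")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    points = sorted(zip(concentrations, percent_inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("50 % inhibition is not bracketed by the data")
```

The returned value is in the same units as the input concentrations and could then serve as the target that a circulating concentration range is chosen to include.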

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
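
The therapeutic index defined above is a simple ratio, and a short sketch may make the comparison between compounds concrete. The function name and the numeric values below are hypothetical illustrations, not data from this disclosure; LD50 and ED50 are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the ratio LD50 / ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses")
    return ld50 / ed50


# Hypothetical compounds: (LD50, ED50) in mg/kg. Illustrative values only.
compounds = {"compound_A": (500.0, 25.0), "compound_B": (80.0, 40.0)}

# Rank compounds from highest to lowest therapeutic index.
ranked = sorted(compounds, key=lambda name: therapeutic_index(*compounds[name]),
                reverse=True)
```

Under these assumed values, the compound with the larger ratio would be listed first, consistent with the stated preference for high therapeutic indices.
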
Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
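
As an illustration of how a regimen might be checked against the ranges above, the following sketch estimates the fraction of a dosing interval spent at or above the MEC from plasma concentrations sampled at evenly spaced times (for example, from HPLC assays). The even spacing, the function names and the sample values are assumptions made for this example only.

```python
def fraction_above_mec(concentrations, mec):
    """Fraction of evenly spaced plasma samples at or above the MEC."""
    if not concentrations:
        raise ValueError("at least one concentration sample is required")
    above = sum(1 for c in concentrations if c >= mec)
    return above / len(concentrations)


def regimen_band(fraction):
    """Classify a fraction against the 10-90%, 30-90% and 50-90% ranges."""
    if 0.50 <= fraction <= 0.90:
        return "most preferred"
    if 0.30 <= fraction <= 0.90:
        return "preferred"
    if 0.10 <= fraction <= 0.90:
        return "acceptable"
    return "outside range"


# Hypothetical samples over one dosing interval (arbitrary concentration units).
samples = [0.5, 1.2, 2.0, 1.5, 0.9, 0.4]
band = regimen_band(fraction_above_mec(samples, mec=1.0))
```

A finer sampling grid gives a better estimate; with sparse samples the result is only a rough guide.
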
An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.
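
The per-kilogram ranges above mix micrograms and milligrams, so a small unit-conversion sketch may help. The function names and the 70 kg body weight are hypothetical; only the range endpoints are taken from the text.

```python
UNIT_TO_MG = {"ug": 1e-3, "mg": 1.0}  # conversion factors to milligrams


def daily_dose_mg(weight_kg, dose_per_kg, unit="mg"):
    """Total daily dose in mg for a body weight and a per-kg dose."""
    return weight_kg * dose_per_kg * UNIT_TO_MG[unit]


def exemplary_range_mg(weight_kg):
    """About 0.01 ug/kg to 100 mg/kg daily, as total mg per day."""
    return daily_dose_mg(weight_kg, 0.01, "ug"), daily_dose_mg(weight_kg, 100, "mg")


def preferred_range_mg(weight_kg):
    """About 0.1 ug/kg to 25 mg/kg daily, as total mg per day."""
    return daily_dose_mg(weight_kg, 0.1, "ug"), daily_dose_mg(weight_kg, 25, "mg")


# Hypothetical 70 kg adult.
low_mg, high_mg = preferred_range_mg(70)
```

The actual amount administered remains subject to the factors discussed in this section, including the judgment of the prescribing physician.
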

The amount of composition administered will, of course, be dependent on the subject
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician.

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins, of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO:985, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of a related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte-
Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle, 1982,
J. Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
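
A hydropathy profile of the kind referred to above can be sketched as a sliding-window average over the published Kyte-Doolittle residue scale. The following is a minimal illustration without Fourier transformation; the window length of 7, the function names and the example peptide are assumptions of this sketch, not parameters specified in this disclosure.

```python
# Kyte-Doolittle hydropathy values; more negative means more hydrophilic.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean Kyte-Doolittle score of every full window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(seq, window=7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(seq, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window], profile[start]


# Hypothetical peptide used only to exercise the functions.
start, peptide, score = most_hydrophilic_window("IIIIKKKKIIII", window=4)
```

Windows with strongly negative mean scores would be candidate surface regions under this scale; the Hopp-Woods method uses a different residue scale but the same windowing idea.
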

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.



5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
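
For illustration, a Scatchard-type estimate of binding affinity can be sketched as a straight-line fit of bound/free against bound ligand, where the slope is the negative of the association constant. This is a simplified textbook form assuming a single class of non-interacting binding sites; it is not the specific procedure of the cited reference, and the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Least-squares fit of B/F = Ka * (Bmax - B); returns (Ka, Bmax)."""
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx                    # slope of the Scatchard line = -Ka
    intercept = mean_y - slope * mean_x  # y-intercept = Ka * Bmax
    ka = -slope
    return ka, intercept / ka


# Synthetic single-site data (arbitrary units): B = Bmax * F / (Kd + F).
kd_true, bmax_true = 2.0, 10.0
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [bmax_true * f / (kd_true + f) for f in free]
ka, bmax = scatchard_fit(bound, free)
```

With noise-free single-site data the fit recovers Ka = 1/Kd and Bmax; real binding data are noisier, and curvature in the plot would suggest that the single-site assumption does not hold.
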

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-
binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983, Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985, In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983, Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985, In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989, Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991, EMBO J., 10:3655-3659.
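
The count of ten possible molecules can be reproduced with a short enumeration. The sketch below assumes that each antibody consists of two heavy-chain/light-chain arms, that either heavy chain can pair with either light chain, and that the order of the two arms does not matter; the chain labels are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["H1", "H2"]  # heavy chains of the two parental antibodies
light_chains = ["L1", "L2"]  # light chains of the two parental antibodies

# Every heavy chain may pair with every light chain: four possible arms.
arms = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of arms, repeats allowed.
molecules = list(combinations_with_replacement(arms, 2))

# The desired bispecific molecule has one correctly paired arm of each parent.
correct = {("H1", "L1"), ("H2", "L2")}
bispecific = [m for m in molecules if set(m) == correct]
```

Under these assumptions, `len(molecules)` is ten and `len(bispecific)` is one, which is why purification of the correct molecule from the mixture is needed.
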

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and methyl-4-
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and antibody-
dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti-
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al., Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA)
is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
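
By way of illustration only, the two storage choices named above (an ASCII text file and a database application) can be sketched as follows. This is a minimal sketch under stated assumptions: the FASTA-style file layout, the function names, the table name `sequences`, and the demonstration record are all hypothetical, and Python's built-in `sqlite3` module merely stands in for a database application such as DB2, Sybase, or Oracle.

```python
import sqlite3


def write_fasta(path, seq_id, sequence, width=60):
    """Record one sequence in an ASCII text file (FASTA-style layout, an assumption)."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(f">{seq_id}\n")
        # Wrap the residues so that each line holds at most `width` characters.
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


def read_fasta(path):
    """Read back the single record written by write_fasta."""
    with open(path, encoding="ascii") as handle:
        header = handle.readline().strip()
        sequence = "".join(line.strip() for line in handle)
    return header[1:], sequence


def store_in_database(conn, seq_id, sequence):
    """Record the same sequence in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", (seq_id, sequence)
    )
    conn.commit()


def fetch_from_database(conn, seq_id):
    """Return the stored residues for seq_id, or None if no such record exists."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None
```

Either form leaves the sequence retrievable by a program; which one is preferable depends, as noted above, on the means chosen to access the stored information.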

By providing any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954 or a representative fragment thereof, or a nucleotide sequence at least 95%
identical to any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 in computer readable form, a skilled artisan can routinely access the sequence
information for a variety of purposes. Computer software is publicly available which allows a
skilled artisan to access sequence information provided in a computer readable medium. The
examples which follow demonstrate how software which implements the BLAST (Altschul et
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207
(1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may be
useful in producing commercially important proteins such as enzymes used in fermentation
reactions and in the production of commercially useful metabolites.
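
For orientation, the notion of an ORF can be illustrated with a naive six-frame scan. This is not BLAST or BLAZE, which are similarity-search programs; it is a simplified sketch that assumes an ORF is an ATG-to-stop stretch in the standard genetic code, and the function names and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(sequence, min_codons=30):
    """Scan both strands and all three frames for ATG...stop stretches.

    Returns (strand, frame, start, end, orf) tuples. Offsets are 0-based on
    the scanned strand, `end` is exclusive and includes the stop codon.
    `min_codons` counts the codons before the stop codon.
    """
    orfs = []
    forward = sequence.upper()
    for strand, seq in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first start codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (pos - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, pos + 3, seq[start:pos + 3]))
                    start = None  # look for the next start codon in this frame
    return orfs
```

A candidate found this way would still need to be compared against known sequences, for example with the search programs named above, before it is treated as protein encoding.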

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
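
The statement that longer target sequences are less likely to occur at random can be made concrete with a back-of-the-envelope estimate. The sketch below assumes, purely for illustration, that database residues are independent and equally frequent (4 bases or 20 amino acids); real sequence databases deviate from this, and the function name is hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches of one target in a random database.

    Assumes independent, equally frequent residues: each of the
    (database_length - target_length + 1) start positions matches with
    probability alphabet_size ** -target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)
```

Under these assumptions each additional nucleotide in the target lowers the expected number of chance matches about fourfold, and each additional amino acid about twentyfold, which is why very short targets are of limited use against a large database.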

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
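
As one illustration of searching for a nucleic acid target motif, a hairpin candidate can be approximated as a perfect inverted repeat: a stem, a short loop, and the reverse complement of the stem. The following is a simplified sketch only. The stem and loop lengths are arbitrary assumptions, mismatches and G-U pairing are ignored, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(sequence, stem=6, min_loop=3, max_loop=8):
    """Return (start, stem_seq, loop_seq) for every perfect inverted repeat.

    A hit is a stem of `stem` bases, a loop of min_loop..max_loop bases, and
    then the exact reverse complement of the stem.
    """
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[start:start + stem]
        wanted = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            right = seq[right_start:right_start + stem]
            if len(right) == stem and right == wanted:
                hits.append((start, left, seq[start + stem:right_start]))
    return hits
```

For example, `find_hairpins("TTGCGCATAAAAATGCGCTT")` reports the stem `GCGCAT` at offset 2 with the loop `AAAA`.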

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
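
The antisense case can be sketched in a few lines: an antisense oligonucleotide of 20 to 40 bases is the reverse complement of the chosen mRNA region. The sketch below covers only this Watson-Crick case; triple helix design follows different pairing rules and is not attempted here. The function names and the example transcript used in the note are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide complementary to mrna[start:start + length].

    The mRNA may be written with U or T. The length is restricted to the
    20 to 40 base range given in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    if start < 0:
        raise ValueError("start must be non-negative")
    dna = mrna.upper().replace("U", "T")
    target = dna[start:start + length]
    if len(target) != length:
        raise ValueError("target region runs past the end of the sequence")
    return reverse_complement(target)
```

For the made-up transcript `AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG`, `antisense_oligo(..., 0, 20)` gives `CGGCCCATTACAATGGCCAT`, read 5' to 3'.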

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.



4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954, or bind to a specific domain of the polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides (for example, see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," in Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Because the
corresponding gene is only expressed in a limited number of tissues, a hybridization probe
derived from any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 can be used as an indicator of the presence of RNA of cell type of such a tissue in a
sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
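
A degenerate pool can be written compactly with the standard IUPAC nucleotide ambiguity codes and expanded into its member sequences. The sketch below is illustrative only: the text does not state how a pool would be specified, so the use of IUPAC codes is an assumption, and the function name is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """Return every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[symbol] for symbol in probe.upper()]
    return ["".join(bases) for bases in product(*choices)]
```

The pool size is the product of the number of alternatives at each position, so a probe such as `AYNG` represents 1 x 2 x 4 x 1 = 8 sequences. Pools grow quickly with each added ambiguous position.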

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced
using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.
Carbodiimide (0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide in
10 mM 1-MeIm7) is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min, and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
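
The per-well quantities implied by this protocol can be checked with simple arithmetic. The sketch below rests on two assumptions that the text does not state outright: that 7.5 ng/ul is the DNA concentration before the 1-methylimidazole stock is added, and that the carbodiimide solution is itself made up in 10 mM 1-MeIm7. The function and key names are hypothetical.

```python
def well_composition(dna_ng_per_ul=7.5, meim_stock_molar=0.1,
                     meim_final_molar=0.010, dna_volume_ul=75.0,
                     edc_molar=0.2, edc_volume_ul=25.0):
    """Estimate what ends up in one CovaLink NH well.

    Diluting the 0.1 M 1-MeIm stock to 10 mM means the stock makes up one
    tenth of the DNA solution, so the DNA is diluted to nine tenths of its
    starting concentration (assumption: 7.5 ng/ul is the starting value).
    """
    stock_fraction = meim_final_molar / meim_stock_molar
    dna_after_dilution = dna_ng_per_ul * (1 - stock_fraction)
    total_volume_ul = dna_volume_ul + edc_volume_ul
    return {
        "dna_ng_per_well": dna_after_dilution * dna_volume_ul,
        "edc_final_molar": edc_molar * edc_volume_ul / total_volume_ul,
        "total_volume_ul": total_volume_ul,
    }
```

With the default values this gives roughly 506 ng of DNA per well in a 100 ul reaction, with the carbodiimide at about 0.05 M.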

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (pp.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook
et al. (1989), shearing by ultrasound, and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6 (incorporated herein by reference). In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way of fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lacZ minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
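
The difference between the normal and the relaxed site specificity can be illustrated with a short
sketch. It assumes that Pu means A or G and Py means C or T, treats the relaxed (CviJI**)
specificity as the union of PuGCPy, PyGCPy and PuGCPu, and cuts between the G and the C. The
function names and the example sequence are hypothetical and are not taken from Fitzgerald et al.

```python
import re

# Assumption: Pu = purine (A or G), Py = pyrimidine (C or T).
NORMAL_SITES = ["[AG]GC[CT]"]                                # PuGCPy
RELAXED_SITES = ["[AG]GC[CT]", "[CT]GC[CT]", "[AG]GC[AG]"]   # PuGCPy, PyGCPy, PuGCPu


def cut_positions(seq, patterns):
    """Return sorted 0-based cut positions for a linear sequence.

    A cut position p splits the sequence into seq[:p] and seq[p:];
    each 4-base site is cut between its G and its C (blunt ends).
    """
    seq = seq.upper()
    cuts = set()
    for pattern in patterns:
        # The lookahead allows overlapping sites to be found.
        for match in re.finditer("(?=(%s))" % pattern, seq):
            cuts.add(match.start() + 2)
    return sorted(cuts)


def digest(seq, patterns):
    """Split a linear sequence into fragments at every cut position."""
    bounds = [0] + cut_positions(seq, patterns) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


example = "TTAGCTAACGCCATTGGCAA"   # invented sequence with one site of each kind
normal_fragments = digest(example, NORMAL_SITES)
relaxed_fragments = digest(example, RELAXED_SITES)
```

In this toy sequence the normal specificity finds a single site, while the relaxed specificity finds
three, which is the sense in which relaxed digestion approaches a quasi-random fragmentation.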

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
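
The 64-patient example can be checked with a small coordinate calculation. The sketch below
assumes an 8 x 12 pin device on a 9 mm pitch (the usual 96-well spacing, which is not stated in the
text) and an 8 x 8 subarray of dots on 1 mm centres, which leaves a 1 mm space between
neighbouring subarrays. The function name and the index conventions are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12     # 96-pin device
PIN_PITCH_MM = 9.0             # assumption: standard 96-well spacing
DOT_PITCH_MM = 1.0             # one dot per 1 mm x 1 mm cell
SUBARRAY_SIDE = 8              # 8 x 8 = 64 dots, one per patient


def dot_position(pin_row, pin_col, patient):
    """Return (x_mm, y_mm) of the dot printed by one pin for one patient.

    Each patient's plate is printed with the whole pin array shifted by a
    small offset, so every pin builds up its own 64-dot subarray.
    """
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin outside the 8 x 12 device")
    if not 0 <= patient < SUBARRAY_SIDE ** 2:
        raise ValueError("patient index must be 0..63")
    off_row, off_col = divmod(patient, SUBARRAY_SIDE)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return (x, y)


# Every (pin, patient) combination lands on its own spot.
all_dots = {
    dot_position(r, c, p)
    for r in range(PIN_ROWS)
    for c in range(PIN_COLS)
    for p in range(SUBARRAY_SIDE ** 2)
}
```

Under these assumptions the 6144 dots occupy a footprint of roughly 107 x 71 mm, which fits on the
8 x 12 cm membrane mentioned above.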

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.
All references cited within the body of the instant specification are hereby incorporated by
reference in their entirety.

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from a human chromosome,
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
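
The clustering step can be pictured with the following minimal sketch. It is not the SBH
signature analysis actually used: it assumes that a clone's signature can be modelled as the set of
7-mers it contains, compares signatures by Jaccard similarity, and groups clones greedily. The
threshold of 0.8, the clone names and the sequences are all invented for illustration.

```python
def kmers(seq, k=7):
    """Set of all k-mers in a sequence (a stand-in for positive probes)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_clones(signatures, threshold=0.8):
    """Greedy clustering: a clone joins the first cluster whose
    representative (its first member) is at least `threshold` similar."""
    clusters = []  # list of lists of clone names
    for name, sig in signatures.items():
        for members in clusters:
            if jaccard(sig, signatures[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


clones = {
    "clone_a": "ATGGCCAAATTTGGGCCCAAA",
    "clone_b": "ATGGCCAAATTTGGGCCCAAA",   # identical insert
    "clone_c": "TTTTTTCACACACAGAGAGAG",   # unrelated insert
}
signatures = {name: kmers(seq) for name, seq in clones.items()}
groups = cluster_clones(signatures)
representatives = [members[0] for members in groups]
```

Taking the first member of each group as its representative mirrors the idea of sequencing one
clone per cluster rather than every clone.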

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.2 EXAMPLE 2
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 1969-2951
and 3949-3954, were assembled using an EST sequence as a seed. Then a recursive algorithm was
used to extend the seed EST into an extended assemblage, by pulling additional sequences from
different databases (i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri
114, and UniGene version 101) that belong to this assemblage. The algorithm terminated when
there were no additional sequences from the above databases that would extend the assemblage.
Inclusion of component sequences into the assemblage was based on a BLASTN hit to the
extending assemblage with BLAST score greater than 300 and percent identity greater than 95%.
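
The extension procedure can be outlined as a loop that stops when no further sequence qualifies.
The sketch below is an illustration only: `search_hits` is a hypothetical callback standing in for
a BLASTN search of the growing assemblage against the databases, the toy hit table is invented, and
the merging of sequences into a consensus is reduced to collecting identifiers. It is written as an
iterative loop rather than a recursive function, which gives the same termination behaviour.

```python
MIN_SCORE = 300        # BLAST score must be greater than this
MIN_IDENTITY = 95.0    # percent identity must be greater than this


def extend_assemblage(seed_id, search_hits):
    """Grow an assemblage from a seed EST until no new sequence qualifies.

    `search_hits(member_ids)` is a hypothetical callback; it returns an
    iterable of (sequence_id, score, percent_identity) tuples for hits
    against the current members of the assemblage.
    """
    members = {seed_id}
    while True:
        new_ids = {
            seq_id
            for seq_id, score, identity in search_hits(frozenset(members))
            if score > MIN_SCORE
            and identity > MIN_IDENTITY
            and seq_id not in members
        }
        if not new_ids:          # nothing extends the assemblage: stop
            return members
        members |= new_ids


# Toy stand-in for the databases: which sequences each member hits.
TOY_HITS = {
    "EST1": [("EST2", 450, 98.0), ("EST9", 250, 99.0)],   # EST9: score too low
    "EST2": [("EST3", 320, 96.5), ("EST8", 500, 90.0)],   # EST8: identity too low
    "EST3": [("EST1", 600, 99.0)],
}


def toy_search(member_ids):
    for member in member_ids:
        for hit in TOY_HITS.get(member, []):
            yield hit


assemblage = extend_assemblage("EST1", toy_search)
```

Both thresholds are applied as strict inequalities, matching the "greater than" wording of the
inclusion criterion.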

Tables 6 and 8 set forth the novel predicted polypeptides (including proteins) encoded by
the novel polynucleotides (SEQ ID NO: 2953-3936 and 3949-3954) of the present invention, and
their corresponding nucleotide locations to each of SEQ ID NO: 2953-3936 and 3955-3960. Tables
6 and 8 also indicate the method by which the polypeptide was predicted. Method A refers to a
polypeptide obtained by using a software program called FASTY (available from
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polypeptides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). Method B refers to a polypeptide obtained by using a
software program called GenScan for human/vertebrate sequences (available from Stanford
University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic
model of gene structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94
(1997), incorporated herein by reference). Method C refers to a polypeptide obtained by using a
Hyseq proprietary software program that translates the novel polynucleotide and its complementary
strand into six possible amino acid sequences (forward and reverse frames) and chooses the
polypeptide with the longest open reading frame.
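
The idea behind Method C can be shown with a generic six-frame translation. This sketch is not
the Hyseq proprietary program: it assumes the standard genetic code, treats an open reading frame
as the longest stretch free of stop codons (it does not require an initiating methionine), and uses
hypothetical function names.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; '*' marks a stop codon.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Three forward frames followed by three reverse-complement frames."""
    strands = (seq.upper(), reverse_complement(seq))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


def longest_orf(seq):
    """Longest stop-free stretch over all six frames (first one wins ties)."""
    stretches = [
        stretch
        for frame in six_frame_translations(seq)
        for stretch in frame.split("*")
    ]
    return max(stretches, key=len)


frames = six_frame_translations("ATGGCCAAATAG")
best = longest_orf("ATGGCCAAATAG")
```

For very short inputs such as this one, a reverse frame can easily outscore the intended forward
frame, which is one reason a real predictor would apply additional criteria.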

5.3 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), full length gene cDNA sequences
and their corresponding protein sequences were generated from the assemblage. Any frame shifts
and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences are shown in the
Sequence Listing as SEQ ID NO: 1-351. The amino acids are SEQ ID NO: 985-1335.
Table 1 shows the various tissue sources of SEQ ID NO: 1-351.

The nearest neighbor results for SEQ ID NO: 1-351 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 1-351 from Genpept. The translated amino acid sequences that
the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 1-351 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites",
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.
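
How a maximum and a mean S score summarise a prediction can be sketched as follows. The sketch
assumes that per-residue S scores and a predicted signal peptide length are already available
(for example from a SignalP-style output), takes the mean over the residues of the predicted signal
peptide only, and uses invented numbers; it does not reproduce the neural network itself.

```python
def summarize_s_scores(s_scores, cleavage_index):
    """Return (max_S, mean_S) for a predicted signal peptide.

    s_scores       -- per-residue S scores, N-terminus first (assumed input)
    cleavage_index -- number of residues in the predicted signal peptide,
                      i.e. cleavage falls after s_scores[cleavage_index - 1]
    """
    if not 0 < cleavage_index <= len(s_scores):
        raise ValueError("cleavage_index must lie within the sequence")
    max_s = max(s_scores)
    signal_part = s_scores[:cleavage_index]
    mean_s = sum(signal_part) / len(signal_part)
    return max_s, mean_s


# Invented scores for a 10-residue toy sequence with a 4-residue signal peptide.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1, 0.1, 0.0, 0.0, 0.0]
max_s, mean_s = summarize_s_scores(toy_scores, 4)
```

A high mean over the predicted signal peptide together with a high maximum is what would be
reported for a confidently predicted signal peptide under these assumptions.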

5.4 EXAMPLE 4 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 352-766. The
corresponding amino acid sequences are SEQ ID NO: 1336-1750.

Table 1 shows the various tissue sources of SEQ ID NO: 352-766. 
The nearest neighbor results for SEQ ID NO: 352-766 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 352-766 from Genpept. The translated amino acid sequences that
the nucleic acid sequences encode are shown in the Sequence Listing. The homologs with
identifiable functions for SEQ ID NO: 352-766 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5 EXAMPLE 5
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 767-930. The
corresponding amino acid sequences are SEQ ID NO: 1751-1914.

Table 1 shows the various tissue sources of SEQ ID NO: 767-930. 
The homology results for SEQ ID NO: 767-930 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the homologs for
SEQ ID NO: 767-930 from Genpept. The translated amino acid sequences that the nucleic
acid sequences encode are shown in the Sequence Listing. The homologues with identifiable
functions for SEQ ID NO: 767-930 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.6 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 931-965. The
corresponding amino acid sequences are shown in SEQ ID NO: 1915-1949.
Table 1 shows the various tissue sources of SEQ ID NO: 931-965.

The nearest neighbor results for SEQ ID NO: 931-965 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 931-965 from Genpept. The translated amino acid sequences that
the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 931-965 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.7 EXAMPLE 7
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 966-974. The
corresponding amino acid sequences are SEQ ID NO: 1950-1958.
Table 1 shows the various tissue sources of SEQ ID NO: 966-974.

The nearest neighbor results for SEQ ID NO: 966-974 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 966-974 from Genpept. The translated amino acid sequences that
the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 966-974 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.8 EXAMPLE 8 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 975-984. The
corresponding amino acid sequences are SEQ ID NO: 1959-1968.

Table 1 shows the various tissue sources of SEQ ID NO: 975-984.
The nearest neighbor results for SEQ ID NO: 975-984 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 21, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 975-984 from Genpept. The translated amino acid sequences that
the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 975-984 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.9 EXAMPLE 9 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 3937-3942. The
corresponding peptide sequences are SEQ ID NO: 3943-3948.

Table 1 shows the various tissue sources of SEQ ID NO: 3937-3942. 

The nearest neighbor results for SEQ ID NO: 3937-3942 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 3937-3942 from Genpept. The translated amino acid sequences that
the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 3937-3942 are shown in Table 9 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et aL, J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 10 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 

30 the eMatrix p-value(s) and the position(s) of the signature within' the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 11 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 12 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
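
The two summary values can be illustrated with a short Python sketch. It assumes that a per-residue S score is already available for each position of the polypeptide (in practice these values come from the SignalP program and are not computed here), that the maximum S score is taken over all supplied positions, and that the mean S score is averaged over the residues of the predicted signal peptide. The function name and its arguments are hypothetical.

```python
def signal_peptide_scores(s_scores, cleavage_pos):
    """Return (maximum S score, mean S score over the predicted signal peptide).

    s_scores     -- per-residue S scores, N-terminus first (assumed input)
    cleavage_pos -- number of residues in the predicted signal peptide,
                    i.e. cleavage occurs after residue cleavage_pos (1-based)
    """
    if not 1 <= cleavage_pos <= len(s_scores):
        raise ValueError("cleavage_pos must lie within the scored sequence")
    region = s_scores[:cleavage_pos]
    return max(s_scores), sum(region) / len(region)
```

The numbers passed to this function in any test are illustrative placeholders and are not results for any sequence of the Sequence Listing.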

Tables 5 and 13 are correlation tables of all of the sequences and the SEQ ID NOS.



TABLE 1 



Tissue Origin 


RNA 
Source 


Library 
Name 


SEQ ID NOS: 


lung 






3 11 25 49 65 75 114 141 156 160 172
190 198 209 217 224 229 234-235 267
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


adult brain 


GIBCO 


AB3001 


1 3 12-13 16 22-24 28-29 41 48 58 65 78 
82 89-90 94 97 103 112 114-115 117 120
122 130-131 168 181 184 186-187 189-
190 198 208 216 247 249 259 270 277
297 301 308 312 314 321 333 348 374
396 403 406 410 412 416-417 420 423
426-427 431 456 474 481 484-485 488
498 500 508-509 530 549 553 558 563-
564 583 596 602-603 608 612 621-622
624 643 650 674 699 711 736 738-739
753 770 779-780 785-786 802-803 816 
822 839 842 848 859 861 871 893-894 
897 900 903 925 954 958 967 969 


adult brain 


GIBCO 


ABD003 


3 19 21-25 28-29 31 33-34 37 39 41 46-48 
53 58 63-64 66 72 78 80 99 103 109-110
112 114 118 120-124 126 132-133 135
139 143 146 148-149 159 163 168 174
176 179-180 184-185 188-190 202 208-
209 216-217 221 223 230 234-235 240
244 249 251 253 255 258-259 263 269-
270 277 282 285-286 290 294-295 297
301-302 304-305 307-308 311-312 314 
320 329 333 335-336 342 344 346 349 
354 358 365 370 373-374 377 380 382- 
383 388 394-396 399 401-402 406 409- 
410 413 416 420-421 425 428 430-431 
436-437 442 456 462 464 466-467 474 
484 486 495-496 500-501 506 508-509 
519 530 537 542 549 561-562 564 572 
574 577-578 580-583 586-587 589 592- 
593 596-597 601 608 610 612-614 617- 
624 630-632 635 637 650 658 663-664 
668 676 679 681 689-690 693 699 724 
726 732 736 742-743 747 767-770 780 
784 789 793 799 802-805 813 817-818 
822 824 829-831 837 839 845 848 856 
859-860 864 871-872 875-876 881 887 
896-897 901 903 907 910-911 925 930 
933 943-944 947 952-953 958 962-963 
965 967 972 977 


adult brain


Clontech 


ABR001 


3 53 66 113 115 126 135 160 172 179 185 
204 263 273 305 312 323 358 380 383
395-396 403 420 428-429 431 461 542 
583 586 606-607 611 620 645-646 688 
690 715 732 736 740 748 754 768 784- 
786 790 796 800 878 897 906-907 947 
977 


adult brain 


Clontech 


ABR006 


19 32 49 53 60 72 91 103 118 125 130- 
131 134 184 224 275 338 350 354 361- 
363 374 384 390 394 396 431-432 434- 
435 445 468 549 621 732 734-736 745 
760-761 764 768-769 775 787 806 811 
818 887 903 906 918 930 942 947 957 
973 977


adult brain 


Clontech 


ABR008 


2-3 9-11 14 17 21 23-25 28-29 31-35 37 
41-42 45 47-48 56-57 65-66 69-70 72 75 
77-78 88 91-92 97-99 101 103 112-115
118-128 130-131 135 138-140 142 144- 
146 148 152 156-157 159-160 163 168 
172 174 176 178-180 182-190 194 196- 
198 200-201 204 209-214 218 220-225 
228-230 232-233 238-240 243-244 246 
254-256 260-264 270 272-274 278-279 
282-285 289-291 293-294 296-297 301 
303-306 312-314 317 321-322 325-328 
334 336 338 340-342 344 346 348 350- 
352 354 356-358 363 366 369-374 376 
379-381 383-386 388-394 398-399 402-
403 405 409-412 414 418-421 423-424
426-427 430 433-437 443 445-450 452 
456-457 460 462 464 471 479 482-483 
485 488 490-498 505 507 510 516 519- 
522 524 527-532 535 538-539 542-545 
548 551 553 555 561-562 566 569 571 
574 580-583 588-589 593 597 601-608 
611-612 614-615 617-618 621-622 624 
630-635 642 644 646-648 650-652 655 
657 659-661 664-665 668 672 674 689 
693-699 701-702 708 711 715 717 724
728-730 732 734-735 738-740 745 747- 
750 753-755 757 761 763-764 766-769 
772-773 775 780-781 789-791 793-795 
799-800 802-806 809 812 818-819 821- 
822 826 829-830 832 834-835 841 843
845 856 858-859 861 864 866 870 872 
876 880 883 885 887 893-898 902 906- 
916 918 921 925-926 930-931 933 942- 
943 946 948 950-951 953-954 958-960 
962-965 967 969-970 972 977 


adult brain 


Clontech 


ABR011 


57 196 270 304 344 436 834


adult brain 


BioChain 


ABR012 


14 82 121-122 168 691 


adult brain 


Invitrogen 


ABR013 


72 108 263 270 336 425 492-494 732 787 
790 826 880 


adult brain 


Invitrogen 


ABR014 


293 394 399 764 768-769 928 967 


adult brain 


Invitrogen 


ABR015 


738-739 764 


adult brain 


Invitrogen 


ABR016


320 374 396 399 405 684 742-743 767 
931 947 967 


adult brain 


Invitrogen 


ABT004 


21 33-34 37-38 47 52 57-58 69 72 91-93 
109 119 122-124 126-127 135 142-143
158 167-168 185-188 194 200 212 232
242 246 255 258 270 277 279 293 301
312-313 319 322-323 331 341 346 348
371 374 388 391 394 399 401 409 411
429 436-437 456 462 477 488 496 498
510 512 515 539 542 545 549 559 563
573 579 587 589 601-605 612 620-621 
624 640 643 647 681 715 723 728 732 
735-736 740 745 748 753 766 785-786 
792-793 797-801 812 822 829-831 853- 
856 859 876-877 884 893-894 908-909 
918 925 933 950 969 978 


cultured 
preadipocytes


Stratagene


ADP001 


4 28-29 69 93 114 121 132-133 135 151- 
152 159 167 172 178 181 184 190 194- 
195 203-204 209 217 219 240 248 260- 
262 267 273-274 277 282 297 301 304 
312 314 326-327 361-362 371 374 388 
394 401 403 405 411 420 437 453 466-
467 470 474 478 496 507-509 517 530
532-533 584 588 593 602-603 608 610
617-621 630-631 633 639 642-643 661
693 729 746 761 765 769 834 842 848
887 907 923 947-950 957 967 969 



adrenal gland 



Clontech 



ADR002 



1 3 12-13 21 23-24 27-29 67 74 78 103- 
105 108-109 113 115 118 120-121 128- 
133 149 156 160 172 177 182 214 217 
223 232-233 247 254 269-270 273-274 
277 283 285 288 298-299 308 317 319 
328 338 340 342 361-362 364 372 376- 
377 382 384 401-402 405-406 416 420 
431 437 444 446 448 457 462 484 500 
507 517 524 532-533 539 545 554 561- 
562 564 588 597 602-603 606-607 635 
642 646 649 658 664 674 693 703 730 
740 745 752 759 765 767 775 779 799 
809 817-818 839 845 856 859 863 887 
890-891 896 948 953 958 961-963 973 



adult heart 



GIBCO



AHR001 



1 3-4 8 10 14 20-21 25 28-29 33-34 37-38 
41 48 54-57 65 69-72 75 78 80 82-83 97 
99-100 108 112-115 117-121 123-124 
128-133 141 144-146 149 152 159 162- 
163 168 172 176 179 181 184 186-187 
190-191 201 203 208-209 212 216-218 
221 223 227 229 233 244 247 249 253-
255 258 263-264 267 269-270 274 278 
280-282 285 289 291 295 297-299 301 
303-304 308 313 317 321-322 326 328 
334 344 348 352 358 361-363 370-371
380 382-383 388 394-396 398 401 403 
405-406 410-416 423 425-427 430-431 
436 452-453 464-465 470-474 481-484 
487-488 490 492-494 496 499-500 505- 
506 508-509 514 523 529-530 533 547- 
548 553 558 563-565 577-578 586-588 
590 593 597 601-603 606-608 610-613 
617-619 621-622 626-628 637-638 642-
644 652 658 661 672 682-683 688 691
693 697 699 708 711 713 715 732 737
745 747-748 750-753 759 761 765 768-
770 775 790 802-803 814-815 818-819
830 837 839-840 842 845 848 859 861-
862 867 876-877 887 891-892 896 900-
901 903 905-906 908-909 919-920 922 
925 928 936 939-940 946-947 950 953 
959 967 970-971 973 977 



1 3 8 12-14 17 19-25 28-29 33-34 37-39
41 46-48 50 52 55-60 62 65-67 69 71-72 
75 77-78 82 84 89-90 93 97 108-110 114- 
116 118-121 123-125 128 130-133 135 
138 144 146 149 156 159-161 163-164 
167-172 176 179 184 186-187 189-190 
194 196 200-202 204 209 211-212 216-
217 219 221 223-224 229 232-235 244 



adult kidney 



GIBCO 



AKD001
247 250 253 255-256 258 263-264 268-
272 274 277-281 283 286 288-290 292 
294-295 297 301 303-309 311-314 316 
319-323 325 328-338 342 348-349 352 
354-355 358 361-363 365 370-371 373 
376-378 380 382-383 388 395-399 401-
403 405-406 409-413 416 418-420 425- 
428 430-431 440 442 452-454 462 464- 
465 470 472-474 477 479 481 483-485 
487-489 492-495 498-500 504 506 510
517 522 525 529-530 532-533 539 542- 
543 547 551-552 558 560-564 569-570 
573-574 577-578 580-583 585-590 594- 
596 601-608 610-613 617-621 624 626- 
628 630-631 634-636 639 642-643 648 
652 656 658 664-665 676-677 679 681 
688-691 693 697 699 708 711 715 717 
720-722 724 729-732 738-741 747-748 
751-753 761 765 770-778 780 784 789 
791 793 797 804 813 817 823-824 834 
837 839 842-843 845 848 859 861-862 
864 867 870 876-877 887 889 892-894 
896-897 900-901 903 907 913-915 918 
921 923 925 929-930 932 939 942 946- 
947 949-950 953 958-959 961-963 967 
969 972 977 


adult kidney 


Invitrogen 


AKT002 


1 3 16 21 30 32 35 38-41 46-47 56 77 92 
109 123-124 130-131 146 149 161 167- 
168 172 176 190 209 212 234-235 258 
279 292 301 303 308 314 333 355 363 
372 380 383 396 399 402 418-419 426- 
427 431 448 454 461 471-474 488-489 
495 498 504 506 508-509 520-521 530 
537 539-541 545 547 563 582-583 592 
613 617-618 621 623-624 633 655 688 
690 693 699 704 713 732 745 752-753 
761 766-768 770 784 789 797 837 842 
848-849 866-867 877 887 893-894 903 
914-915 925 929-930 937 944-945 947- 
949 955 961 967 984 


adult lung 


GIBCO 


ALG001 


1 3 14 18 28-29 38 54-56 59 92 110 114- 
115 130-131 146 149 156 159 164 167 
176 184 209 217 234-236 240 255-256 
258 263-264 269 271 276 280-281 297 
305 308 312 314 322 325 332 336 344 
353 361-362 388 401 410 420-421 426- 
427 431 465 469 474 484 498 500 506 
508-509 517 530 532 573 592 596 613 
619-620 623 626-628 638 658 679 681 
684 689 717 731 741 771 791 799 817 
834 845 861-862 864 875-876 901 921 
925 928 932 940 947 949 959 962-963
967


lymph node 


Clontech 


ALN001


3 10 110 146 160 168 196 209 221 269 
278 301 336 348 394 405 411 420 422 
459 464 474 485 503 506-507 532 563 
582 619 623 630-631 642 669 684 697 
713 715 727 747 767 769 789 825 839 
842 849 887 896 913 921 925 


young liver 


GIBCO


ALV001 


3 14 16 37-38 41 51 56 60 97 104-105 
108 110 117 119 128 130-131 134 139
149 152 169-172 176 184 189-190 200 
209 212 216 218 228 232 255 258 263 
270-271 275 285-286 292 295 298-299 
301 304 314 341 358 365 368 376 400 
410-412 431 474 481-482 485 496 500 
504-505 517 520-522 524 530 532-533 
547 551 563 581 583 610-611 621 624 
635 643 691 708 711 715 720 752 755
761 768 796-797 811 818 830 845-847 
852 864-865 867-869 896 899 910-911 
949 958 965 969 972-973 


adult liver 


Invitrogen 


ALV002 


3 37 42 56 60 71 82 104-105 114-115 
117-118 125 130-131 134-135 164 169- 
172 176 179 200 203-204 212 217 223 
226 232 237 244 263 274-275 292 301 
310-312 314 317 349 354 364 368 372 
376 398-399 402 426-427 439 442 451 
458 465 474 482 485 490 506 515 525 
527 545 547 552 568 571 573-575 582 
587 594-595 604-605 608 610 621 630- 
631 634-635 637 657 664 690 693 699 
723 726 745 751 763 767 784 793 811
822 845 848 852 856 861-862 864 892 
899 908-909 925 950 958 967 983 


adult liver 


Clontech 


ALV003 


60 134 169-171 275 


adult ovary 


Invitrogen 


AOV001 


1 3 9-10 12-14 16 18 20 22-25 28-29 33- 
35 37 39 41-42 46 48-50 55-57 59 63-67 
69 71-72 75 77-80 82 88-89 92 101 103- 
106 108-110 113 115 119-121 123-126 
128-133 135 138 142-146 149 151-152 
159-161 167-168 172 174 176-177 179
181 184-190 194 198 200 203 208-209
211-212 214 217 219 221 224 226 232-
235 240-242 246-247 249 251 254-255
258-259 264 269-271 274 276-277 279-
283 285 288 290 293-294 297 301-304 
306-308 311 314 319-322 325-326 328- 
329 331-332 335-338 341-342 344 348 
354-358 361-363 365 368 370-372 374 
376 379-380 382-383 388 394-396 398- 
399 401-402 405-406 409-412 416 418- 
421 423 425-433 438 442-443 449-452
454 462 464 466-467 469-471 474 479
482-484 488 490 492-496 498 500-504
506-509 511 515-518 520-524 529-530 
532-533 537 539-542 545 551 555 558 
560-565 569 571 573 577-578 581-583 
585-590 592-593 596-597 600-605 608 
610-611 613-614 617-628 633-637 639 
642-643 646-648 650 652 654 656 658 
664 668-670 672 674 679 681 684 688 
691 693 697-699 701-702 713 717 721- 
722 724 729-732 738-744 747-750 752- 
753 755 759 761 765 767-774 779-780 
783-784 789 793 795-797 801 813-818 
823-824 828 830-832 834 837 839 841- 
842 845 848-851 856 859 862 864 866- 
867 870-871 874-878 881-883 887-889 
891 893-894 896-897 901 903 906-911
913 919-922 925 928 930 936 939-940 
943-944 946-947 949-950 952-953 955 
957-958 962-963 965 967 969 971 973 
977 981-982 


adult placenta 


Invitrogen 


APL001 


41 56 67 253 301 304 334 380 383 451 
474 479 500 577-578 643 648 729 767 
856 859 866 873 962-963 


placenta 


Invitrogen 


APL002 


3 21 31 38 63-64 78 135 143 168 186-187 
212 232 244 263 280-281 334 336 344 
348 371 374 394 399 461 490 582 588 
602-607 610 620 699 745 769 793 817 
822 859 897-898 923 928 931 943 949 
969 973 


adult spleen 


GIBCO 


ASP001 


1 3 21-22 46 52 54-55 57-58 61-62 72 74 
78 82 88 118 121 130-131 137 152 159 
168 172 189 203 209 217 223 234-235 
252 255 263 269 271 274 282 288 290 
301 314 322 335 350 363 394 403 405- 
406 410-412 415 431 459 464 472-474 
482 488 500 506 510 514 517 532 537
542 561-563 589 593 602-603 610 613 
619 621 636 642-643 655 658 662 674 
676 679 681-682 684 689 691-692 697 
699 715 720 723 729 747-748 769-770 
782 793 818 830 834 845 856 859 862 
877 887 893-894 896 903 906-907 914- 
915 918 925 928 930 940 946 965 967 
977 982 


testis 


GIBCO


ATS001 


6 22 28-29 33-34 41 48 52 62 65 72 97 
106 109 118 132-133 145-146 168 172 
176 183 185 189-191 195 209 211-212 
214 221 223 230 254-255 258 263 269 
283 297 312 314 321 342 352 361-362
365 380 383 388 395 401 405-406 412 
430-431 441 469-470 474 479 495-496 
500 506 520-521 533 543 545 548 560
563 574 582 589-590 593 608 616-618
620 623-624 638 642-643 697 699 708 
711 745 747-748 765 767-768 779 784 
789 812-813 834 837 839 848 859 862 
868-869 875-877 887 889 893-894 896 
928 944 947 953-955 972 981 


Genomic DNA 
from BAC 
63118 


Research 
Genetics 
(CITB BAC 
Library) 


BAC001 


515 


Genomic DNA 
from BAC 
39316


Research 
Genetics 
(CITB BAC 
Library) 


BAC002 


640 


Genomic DNA 
from BAC 
39316 


Research 
Genetics 
(CITB BAC 
Library) 


BAC003 


640 


adult bladder 


Invitrogen 


BLD001 


50 55 66 71 111 143-144 148 160 201 209 
223 255-256 280-281 286 305 315 319 
340 394 431 442 488 497 505 518 552
588-589 621 636 664 676 715 738-739
769 790 824 837 845 877 887 936 940
948 962-963 967


bone marrow


Clontech 


BMD001 


3 10-13 16 18 20-21 25 28-29 31-34 41 45 
48 52 54-55 57 59 61 65 67 72-73 75 78 
80 82 84 99 103 108 110 114-115 118- 
120 123-124 128 130-133 143-144 148
152 159-161 163 168 172 174 176 178 
190 192 198 203 209 211 217-218 221 
223-224 227 233-236 244 247 249 252 
254 258 260-262 267-269 272 278 280- 
281 284-285 288 290 294-297 301 304 
308 314 317-318 320-321 325 328-330 
333-335 349 351-354 358 363 365 367
377 382 388 394-397 400 405 408 410-
412 418-421 425-428 431 433 435 442
449-450 453 455 459 464 468-470 474 
478-479 481 484 490 496 504 506 508- 
509 511 519-521 530 532 539 553 558- 
559 561-563 580 582 586 592 599 608 
610 613-614 617-619 623 625-628 635 
638 641-643 658 664 672 682 699 711 
713 717 731 734 740 742-743 745 761 
768-771 774 776-778 784 787 789 813 
817-818 822 834 839-840 842 848 862 
866 870 876 885-887 891 896-898 900 
903 906 913 919 921-922 927-928 939 
944 947 950 953 959 961-963 967-968 
970 973 977 


bone marrow 


Clontech 


BMD002 


3 9-10 15-19 30 33-34 39 45 54 57 63-64 
71 82 102 116 119 130-133 148 152 156
159-160 168 176 182 224 254-255 271-
272 282 285 290 297-299 301 305 323 
333 340 344 351-355 358 361-362 364 
367 370 372 387 394-395 399 403 405 
409 411 449-450 459 461 468 474 488-
489 524 530 532 580-582 592 602-603 
611 617-618 621-622 630-632 642 661 
663 694 717 730 734 740 745 752 755 
761 767 769-771 775-778 784 787 811
813 818 832 840 842 849 859 878 887 
893-894 896-898 903 906 908-909 923 
928 944 946-949 953 958-963 965 982 


bone marrow 


Clontech 


BMD004 


54 


bone marrow 


Clontech 


BMD007 


766 887 928 


adult colon 


Invitrogen 


CLN001 


22 37 67 97 117 121 148-149 168 172 190 
200 204-205 232 244 263 268 292 301- 
302 363 377 384 452 455 459 470 530 
582 602-603 619 687 723 728 751 761 
831 861 887 914-916 934 955 969 984 


Mixture of 16
tissues-
mRNAs*


Various 
Vendors* 


CTL016 


358 740 760 


Mixture of 16 
tissues-
mRNAs* 


Various 
Vendors* 


CTL021 


468 527 928 


adult cervix 


BioChain 


CVX001 


1 3 10 14 22 28-30 37 41 47-48 51-52 54- 
57 71 82 89-90 92 106 108 110-111 117- 
118 121 129-131 135 141 143-146 160- 
161 164 168 172 177 189-190 193 195 
200 204 209 211-212 217 226 229-230 
232 234-235 240-242 246 254 260-263 
268-270 274 277 282 285 292 295 297 
305-308 314-316 319 328 343-344 348 
354 358 363 368 380 382-384 389 394 
396 399 401 405-407 410 416 418-421 
428 430-431 437 442 453-454 459 464
469 471-473 476 480 484 492-495 500 
504 506-509 516-517 526 530 532 545 
550-551 563-565 569 577-578 585-586 
590 608 611 613 619 621 623 628 630- 

[illegible]
[illegible] 643 648 656-658 664-
665 674 679 682 689-690 693 700 703 
708 713 721-722 724 728 732 742-743
747 750 752 755 757 761 763 767-769 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain mRNA (Invitrogen), 2)
normal adult kidney mRNA (Invitrogen), 3) normal adult liver mRNA (Invitrogen), 4) normal fetal brain mRNA
(Invitrogen), 5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal
skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA (Clontech),
10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human lymph
node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15)
human esophagus mRNA (BioChain), 16) human conceptional umbilical cord mRNA (BioChain).

779-780 784 788 810-811 813-815 822
834 836-837 839 848 861 866-867 871 
874 877 887 891-894 897-898 901 913 
916 919 921-922 925 946-947 953 958- 
959 967 969 973 


diaphragm 


BioChain 


DIA002 


3 39 184 203 431 563 848 967 


endothelial 
cells 


Stratagene


EDT001 


3 6 8-10 14 19-24 28-29 33-34 37 39 41 
46 48 52 55-58 62-65 67 69 71-72 75 78 
80 82-83 87 101-102 108-109 114-115 
117 123-124 128 130-133 135 138 143 
145-146 149 156 159-160 167-168 172 
174 176-177 179 181 184-187 189-190 
194-195 200 203 208-209 212 216-217 
219 223-224 226-227 229 234-235 244 
248-249 254-256 258 263-264 267 269
271 274 276-282 285 290-291 294 297
301-304 308 311 313-314 316-317 320-
321 323 325-326 328-329 331-332 334-
337 339-341 344 348-349 352 354-355
358 361-363 365 367 371-372 375 379-
380 383 389 394-395 398-403 405-406
409-412 425-428 437 442-443 448 454
464 466-467 474 479 481 490 492-498
500 503 506-509 511 517 520-521 523-
524 530 532 537 540-542 558 561-563
565 569-570 573 581-583 586 588-589
596 602-608 610-611 613 617-622 625
628 630-631 633-637 642-643 646 648
650 652 659 661-662 682 688 690-693
696 698-699 708 712 715 717 720-722
724 727 729 740 745 748-750 752 761 
765 767-770 772-773 779 784 789 792- 
794 796 802-803 811 817-818 821 824
827-828 830 834-835 837 842 845 848 
859 861-862 864 866-867 870 876 885 
887 891 893-894 897-898 900 903 906- 
907 913 916 921 925 939 947 950 953 
955 957-958 962-963 967 973 978 984 


Genomic 
clones from the 
short arm of 
chromosome 8 


Genomic 
DNA from 
Genetic 
Research 


EPM001 


324 515 640


esophagus 


BioChain 


ESO002 


97 103 128 371 474 


fetal brain 


Clontech 


FBR001 


67 129 156 159 232 267 433 446 503 845 
952 


fetal brain 


Clontech 


FBR004 


28-29 185 213 277 350 384 432 485 501 
549 651 747 754 761 780 787 848 870
887 906 958 


fetal brain


Clontech


FBR006


10-11 14 21 30 32 47 49 56 65 69 72 77- 
78 82 84 97 101 115 118 121 125 128
130-131 138 142 148 152 159-160 179
185 188 194 197 203 210 212 214 219
222 227-229 243-246 249 252 256 264
270 273 282 285 290-291 293 301-303 
305-306 312 321-322 325 327 339-340 
344 346 350 354-357 363 367-371 374 
388 391 394-395 399 402 405-406 410 
414 420 426-427 436-437 442 444 454 
456-457 460 462 464 470 480 485 492- 
494 507 510 516 524 528 530-532 539- 
542 549 553-554 561-562 580-582 588- 
589 602-608 611 615 617-619 621-622
624 632 636 641-642 646-647 651-653 
661-662 666-669 672 677 691 715-716 
730 735 740 752 754 761 767-770 772- 
775 780-781 799-801 808 818 822-823 

835 843 845 856 850 864 867 876 880
885 887 [illegible] 893-894 896 913 918 [illegible]
942 946-947 951 957-959 962-963 970- 
971 


fetal brain 


Clontech 


FBRs03 


130-131 312 517 637 691 738-739 


fetal brain 


Invitrogen 


FBT002 


3 22 28-31 47 57 63-64 72 75 77-78 86 
94-95 97-98 126-127 135 140 143 156 
159-160 167-168 177 185 190 196 201 
203-204 214 217 230 254-255 258 267 
273-274 277 279 282-283 292 301-302 
305 312 314 323 329 346 348 367 374 
382 394 399 401 403 412 415 420 432 
437 474 482 485 495 507 513 517 527 
529-530 539-542 548 552 579 587-588 
600 604-605 612 617-618 621-622 624 
634 642-643 647-648 650 679 689 693

699 [illegible] 715 [illegible] 745 748 749 753
768-769 793 797 [illegible] 834 845 848

856 859 893-894 908-909 913 916 931 
933 940 950 967 969 


fetal heart 


Invitrogen 


FHR001 


19 57 130-131 394 431 642 769 844 


fetal kidney 


Clontech 


FKD001 


3 31 33-34 38 48 54 72 160 208-209 211
223 264 269 277 283 290 313 325 341
348 358 396 418-420 474 484 506 508- 
509 517 520-521 537 547 553 558 567

569 587 596 608 610 613 619 622 626- 
627 642 679 734 745 818 843 887 896 
903 916 969 971 


fetal kidney 


Clontech 


FKD002 


19 474 726 903 


fetal kidney 


Invitrogen 


FKD007 


3 118 186-187 230 244 271 432 887 969


fetal lung




[illegible]


[illegible]

274-275 286 354 394 396 406 462 483- 
484 608 619 751 769 771 834 914-915 
925 


fetal lung


Invitrogen


FLG003 


3 8 28-29 32 39 50 66 82 88 92 168 186-
187 200 204 212 226 229 246 274 309
327 332 368 374 382 394 398 426-427
431-432 442 485 536 555-557 587 604-
605 621 624 636 642-643 661 677-678
724 753 769 848 859 864 877-878 896 
902 904 914-915 958 


fetal lung 


Clontech 


FLG004 


130-131 394 664 769 942 


fetal liver- 
spleen


Columbia 
University 


FLS001 


3 8-10 12-13 16-17 19-25 27-29 33-35 37- 
38 41 45-46 48 52 55-58 60-67 69 71-74 
77-78 80 82 84 87-90 104-106 108-109 
112-121 123-125 128-134 138 141 143- 
146 149 151 156 159 163-164 167-172 
174 176-179 181 184 186-188 190 194- 
200-201 203 208-209 211-212 216-217
219 224-227 229-230 232 234-235 237 
241 243-244 246-248 254-255 258 260- 
263 267 269-270 273-282 284-285 288- 
290 292-295 297-299 301-306 308 311- 
318 320-323 326 328 332 335 341-344 
348 352 354-359 361-365 367-368 371- 
374 376-380 382-383 388-389 394-396
398-399 401-411 413-414 416 418-421
425 428-430 432-433 437 439 442-444 
449-450 452 456-457 461-470 472-474 
478-479 481-482 484-485 487 490-494 
497-499 504-507 511 514-515 517-521 
523-524 526 529 532 537 540-541 547 
555 558-559 563 575 577-578 580-596 
598-599 601-603 606-608 610-613 617- 
624 626-628 630-631 634-636 639 642- 
643 647-648 654-656 663-665 672 674- 
675 679 681 684 686 688 691 693-699 
711 713 715 717 719-726 729 732-733 
738-740 745 748-749 751-753 757 759 
761 767-770 776-778 780 784 787 792- 
794 799 804 809 811 813 817-819 822- 
825 830-831 834 837 840 842 845-848
852 856 859 861-862 865 867-869 871 
874-878 887-888 891 893-894 896-900 
903 905-911 913 916 918 923 928 930-
931 936 939 942 944 946-950 952 958- 
959 961-963 965 967 969-970 972-973 
976-977 981-983 


fetal liver- 
spleen 


Columbia 
University 


FLS002 


3 8-13 15-17 19-20 22 25 28-29 33-35 37 
41 45-46 52 54-56 60-61 63-64 66-70 73-
74 78 80 82 92 99 104-106 108-109 112 
115-116 118 120-121 123-125 128 132- 
135 139 141 143-144 146 149 152 156 
159-161 167 169-172 174 176-177 179
181 185 188 190 194 196-197 200 204
212 214 216-218 223-224 226-230 232-
235 237 246-247 252 254-255 258-263
267 270-277 284-286 288 292 294-295
297-299 301 303-305 308 310 314 318
320 323 328 330-332 335-337 340 342-
344 352 354-355 358 361-365 367-368
371 373-374 376-377 382 388 394-396 
398-399 401 405-406 409-411 413 418- 
421 429 431 439-440 442-444 451-452 
457 462-463 466-468 470 474 477-479
481 483-484 487-488 491 495 499 504 
508-509 516 519-521 524 526-528 530 
532 537 540-541 543 545-547 550-551 
553 555 560 564 568 574-575 577-578 
580-592 596-597 600 602-603 608 610- 
611 613-614 617-618 621-622 628 630- 
631 634 637 639 642 644 647 654 658- 
659 665-667 669-675 679 681 684-685 
688-690 693 695 697 708 711 713 715 
717-719 723-727 729 731-734 738-739 
741 745-746 749-750 753 759 761 766- 
767 769-770 776-779 782 784 791-792 
794 805 808 817-818 822 824-825 830
834 837 842 845-849 852 856 859 864-
865 867 874-878 888 891-892 896-900
903 905-906 908-909 913 916 918 921 
923 925 932 936 939-940 942 944 946- 
947 949-950 953 955-956 958-959 961- 
963 965 968-970 973 977-978 981 


fetal liver- 
spleen 


Columbia 
University 


FLS003 


19 60 78 224 273 275 370 373-374 401 
602-603 639 643 730 732 738-739 748
752 770 782 928 930 947 949


fetal liver 


Invitrogen 


FLV001 


37 55 60 69 72-73 97 104-105 108 113- 
114 116-118 121 135 143 152 167-168 
186-187 195 200-201 209 217 223 240
244 253 255 275 284 301 311 314 317 
336 342 348-349 358 371 374 382 394 
402 411-412 418-419 428 430 442 453
517 568-569 580 582 584 587 589 601-
603 606-608 617-618 624 634 639 642-
644 646 664-665 669 679 715 717 720
726 745 748 751 769-770 782 791 794
797 824 830-831 845-847 852 859 870 
899 913-916 925 928 948 956 958 969 
976 982 


fetal liver 


Clontech 


FLV002 


72 418-419 632 


fetal liver 


Clontech 


FLV004 


3 160 169-171 355 367 374 376 547 617- 
618 621 646 717 741 771 836 878 976 


fetal muscle 


Invitrogen 


FMS001 


15 27 32 37 67 72 83 99 112 121 138 167 
174 177 186-187 190 203-204 211 215 
230 252 259 312 374 403 406 409 457 
461 485 505 517 528 530 540-541 544 
549 554 558 579-580 583 602-603 608 
639 642-643 654 664 699 715 730 737 
751 772-773 788 802-803 810 848 856 
859 864 868-869 887 893-894 905-906 
910-911 923 948 967


fetal muscle


Invitrogen 


FMS002 


15 99 130-131 223 361-362 431 474 505 
581 639 643 666-667 784 790 808 810- 
811 874 880 887 903 946 950 958 962- 
963 973 


fetal skin 


Invitrogen 


FSK001


3 6 20-22 32-34 41-45 47 49-52 55 63-64 
66 69 77 80 88 91 98 101 111-112 115 
126 130-131 135 142 144 146 160 163 
167 176 188-190 196 201 204 208 213 
215 217-218 229 232 244 246 248 255 
263 265-269 274 279-281 283 285 288
292 294 297 301 303 308 314 321 341- 
342 344 348 354-355 358 361-362 366 
369 371-372 374 381-382 384 386 394 
401 403 405 413 415 428 431 437 440
460 466-467 472-473 477 481 483 495 
499 504 517 522 532 536-537 539-541 
545 556-558 569 574 576-578 580 584- 
585 587-589 592-593 602-603 606-608 
612 617-618 621 624 634 637 639 642- 
643 647 664 673-674 676 680-681 689 
699 705-707 709-715 724 728-730 738- 
740 745 748 752 765 768-769 772-773 
793 797 817 823 830 834 842 848 859 
861 864 870 874 883 887-888 893-894 
901 904 908-909 913-916 923 925 947 
950 958 962-964 967 975 


fetal skin 


Invitrogen 


FSK002 


3 130-131 146 194 306 354 367 400 405 
474 489 520-521 547 558 561-562 585 
596 730 740 748 755 767 771 810 840 
893-894 946 959 


fetal spleen 


BioChain 


FSP001 


276 563 842 


umbilical cord 


BioChain 


FUC001 


3 20 33-34 39 48 50 52 55-57 65 67 69 72 
77 79 82 92 109 112-113 121 132-133 
138-143 156 167-168 172 174 179 184- 
185 190 194-196 200 202-203 208-209 
229-230 244 269-271 278 284-285 290 
297-299 303 305 308 320 331-332 336 
338 342-343 363 367 372 374 379-380 
383-384 392-394 397 399 402 405-406 
410 425-427 429-430 449-450 474 476 
484 497 499 501 504-505 510 515 517
532-533 539 549 551 558 563 569 574 
577-578 581 586-587 597 602-603 608 
610 617-619 621 626-627 634-637 639 
642-643 658 663-664 674 690-691 693- 
694 699 713 715-717 720 724 726 729 
738-739 746-747 749 759 761 765 768- 
769 774-775 793 797 807 818 822 837 
848-849 856 862 868-869 874 885 887 
892-894 903 906-907 916-917 919-920 
928 936 939 944 946-947 962-963 967 
969


fetal brain


GIBCO


HFB001 


3 9-10 12-14 16 21 25 28-30 32-34 37-39 
41 47-48 52-53 56 65 67 69 71-72 75 80
84 92 97 103 106 110 114 117-119 123- 
124 127 129 132-133 135 138 141-142 
144-146 148-149 152 156 159-160 168 
172 174 176 179 181 184-185 190 198 
208-209 212 214 219 221 223-224 229- 
230 233-236 240 244 247 251 253-255 
258-259 270 273 276-277 285 297 304-
305 308 312 314 322-323 325 328 332- 
333 335-337 339-340 342-344 346 352 
354 358 363 365 370-372 374 382 394- 
396 398 401 403 405-406 409-412 414 
416 425-427 431-432 437 442 445 453
456 462 466-467 469-470 472-474 479
483 488 490 492-497 500-501 504 506- 
510 520-521 524 530 537 539 545 549 
552 558 560-562 564 569 579 582-583 
586-587 596 602-608 610-612 614 617- 
624 626-628 630-631 633 635 638 641 
643 647-648 656 658 661 676 679 688- 
689 693 696-697 711-712 715 724 726
731 735 745 747-749 752 754 761 765 
767-770 774 779-781 784-786 789 799- 
800 802-803 813 818-819 823-824 831 
834-835 837 839 845 848 859 864 866- 
867 871 874-875 881 887 891 893-894
896-897 900 906-907 910-911 918 921-
922 925 927-928 930 943-944 946-947 
950 953 962-963 965 969 972-973 977 


macrophage 


Invitrogen 


HMP001 


86 168 186-187 297 537 608 681 761 845 

877


infant brain 


Columbia 
University 


IB2002 


2-3 9-10 12-14 16 21 25 27-30 32 37-38 
46-47 49 55-56 58 65 69 71-72 78-79 82 
84-86 91-92 98-99 106 109-110 113-115 
118 127-128 130-133 135 138 142 144 
151 156 168 173-176 180-181 185-188 
192 194 196-201 203 208 210-212 214 
217-218 224 229-231 233 236 238 240- 
241 244 246 251-256 259 263 270-271
277-279 284-285 287 293-294 296 301- 
302 308 312-314 317 322-323 327 330 
333 339 342 345-346 351 354 358 361- 
362 365-366 368 370-371 373-374 382 
388 394-396 402 405-406 411-412 415- 
416 420 424-425 428 431 436-437 440- 
441 444-445 453 456 460 465 474 479 
482-483 488 495-496 498 501 503-504 
506-510 515-517 520-521 524-525 529 
531-532 534-535 537 539-542 544-545 
549 561-562 569 574 577-578 580-583 
586-587 589 592 596 600-608 610 612-
613 616-618 620 622 624 629-632 634-
635 637 641 643-644 650-651 653 661 
663-664 676-677 689 693 695-698 708 
711 720-722 724 730 732 735 740 745-
748 754 765-766 768-769 779-781 785- 
786 789 791 796 798 800-803 807 811- 
813 818-819 822-824 830-831 834-835 
837 839 842-843 845 854 856 858 864 
867-869 875-877 879 881 887 892-894 
896 903 907-911 913 916 919-920 925 
930-932 936 939 943 946-947 953 958 
970-973 977-978 982 984 



infant brain 



Columbia 
University 



IB2003 



3 12-13 21 27-29 32 39 49 69 72 82 91
113 116 126 128 132-133 142 144 156
176-177 184-185 188 194 208 212 223-
224 228 230 244 255 259 267 270 273
276 293-294 312 320 326-327 337 342
346 354-355 358 361-363 382 388 390
394 396 399 402 420 425 431 442 462
474 482 484 488 495-496 510 520-522
524 529 540-541 549 563 582 586 588-
589 596 600-603 606-607 612 617-618
620-621 632 647 650 679 720-722 724
735-736 746 751 754 769 785-786 793
800 807 811-813 818-819 822 824 831
834 838-840 843 856 864 892 896 907
919-920 925 930-931 936 947 950 957
973 982



infant brain 



Columbia 
University 



IBM002 



16 47 82 84 201 263 302 376 394 421 440 
488 537 592 606-607 635 740 769 887 
892 906 921 926 971 



infant brain 



Columbia 
University 



IBS001 



84 86 180 185 198 201 203 230 279 312 
326 346 354 366 388 488 542 581 588 
620 647 664 732 740 785-786 801 807
822 827 910-911 925 931



lung, fibroblast 



Stratagene



LFB001



3 11 25 49 65 75 114 141 156 160 172
190 198 209 217 224 229 234-235 267
269 274 277 282 284 303 308 312 320
334 336 352 372 396 398 412 414 437
453 464 470 481 492-494 508-509 532
539 581 584 617-619 621 628 633 643
688 691 745 752 761 768 794 822 837
848 876 887 953 967 973



lung tumor 



Invitrogen 



LGT002 



1 3 9-10 12-13 20 31 38 41 46 48 51-52
56 58 63-64 72 74-75 78 82 88 101 106- 
107 110 114-115 117-118 120-121 123- 
124 128-133 135 143-146 149 151 156 
159-161 163-164 167-168 172 176 178- 
179 184-185 189-191 194-196 200 203
209 212 216-217 226 228-229 232 234- 
236 241 246 248 256 258-259 263-264 
269-271 274 282-283 285-286 290 292
294 297 301 308-309 311 314 317 321
326 328-329 331 333-334 341 348 352
354-355 363 365 371 380 382-383 388
394-395 398-402 405-406 410-411 413
416 418-419 426-427 439 442 452-453
458-459 461-462 464-465 470-471 474
478 483-484 490 495-496 499 510 522 
524 528 536-537 540-541 543 548 556- 
558 560-565 571-573 580 582 587-588 
592 597 602-605 608 610 612-613 617- 
622 625-629 633-634 636 642-644 648 
661 664 669 679 688-689 691 693 699- 
700 708 717 723-724 730 733-734 738-
740 745 747 749 752-753 761 767-768
770 779 782 784-786 789 793-794 797
817-818 820 823-824 834 837 842 845
848 855 857 859 862 864 866 870 875-
877 887 892 896 900-901 907-909 914-
915 919-920 923-925 939 943 947 949
953 958 962-963 965 968 970 972-973
977 


lymphocytes 


ATCC 


LPC001


3 9-11 32 47 50 56 71 75 88 97 99 102 
121 125 128-129 135 138 141 149 163 
167-168 212-213 217 233 255 290 294 
301 305 311 314 342 372 377 388 398- 
399 410 437 442 453 470 474 481 495 
500 506 510 529 532 537 542 558 571 
579 604-605 610 620 628 637 643 658
666-667 676 679 697 708 713 728 730 
734 749 765 768 796 807 818 822 834 
839 848 859 875 885 887 896 903 906 
914-915 928 947 973 981-982 


leukocyte 


GIBCO 


LUC001 


1 3 9 11 18-19 21 23-25 27 31-34 39 41- 
42 46-48 52 54-58 62-69 71-72 74-75 78- 
80 82 89-90 93 99 110 115-121 123-124 
128-133 135 138 141 143-146 149 152 
156 159-161 163 167-168 176 179 181 
186-187 189-190 194 198 200 203-204
209 211-212 218-219 226 232-236 240
244 247 251 253-255 258-259 263-264
269 271 274 278-279 282-283 285 288- 
290 294-295 297 301-306 311 313-314 
317 320-321 325 328 330-331 335 337
342 344 348 350-351 353-354 358-359 
361-365 368 371-372 375 388-389 394- 
395 397-401 403 405 407 409-412 421 
425-427 432 437 442 448-450 452 457
460-461 468-471 474 476 479-482 484
492-494 496-498 500 506-510 516-517
520-521 524 529-530 532 537 540-544 
551 553-554 558 560-565 569 577-578 
580-583 586-587 589 592 596-597 602-
603 606-608 610-624 626-628 630-631
634-635 641-643 654 657-658 661 663- 
665 669 672 677 679 684-689 691 696- 
697 699 708 711 713 715 717 721-724 
728 730 738-740 747-749 755 761 765 
767-769 771 774-779 782 784 789 791- 
792 794-795 797 807-808 811-815 817- 
818 822 824 828 830 832 834 839-840 
842 845 848 856 859 862 864 867 871 
875-877 887 891 893-894 896-898 903 
906-911 913-916 921 923 925 927-928
930 932 935-936 939 943-944 947 949- 
950 953 958-959 961-963 965 967 972- 
973 982 


leukocyte


Clontech 


LUC003 


1 41 82 106 119 123-124 160 177 184 201 
212 221 228 271 279 285 295 321 325 
372 394 411-412 443 468-470 530 532
537 551 569 580-581 613 619 623 626- 
627 642 655 697 761 767 769 775 789
809 867 887 923 928 950 


melanoma 
from cell line 
ATCC#CRL 
1424 


Clontech 


MEL004 


3 25 55-56 67 71 78 109 121 129 146 167 
172-173 176 200 209 212 258-259 263 
278 297 301 306 312 335 338 340 352 
361-362 367 388 395 402 410 418-419 
429 437 454 464-465 481 496 500 503
507 524 532 539 560-562 581-582 587 
589 599 612-613 617-621 623 643 657 
663-664 672 715 724 748 752 761 767-
768 770 785-786 789 835 848 877 887 
896 916 919-920 947 967 978-980 


mammary 
gland 


Invitrogen 


MMG001 


1 14 19 21 28-29 31-37 47 49-51 55 57 
63-67 69 71-72 75-78 92 108-109 111 116 
121 123-124 126 128 130-133 135 143- 
144 148-150 156 159 164 168 172 177- 
179 184 186-187 190 194 200-204 209 
212 217 226 230 232-236 241 244 246- 
247 252 255 258-259 263 268 270 275 
279-283 285 290 292-293 301 304-305 
311 313-314 317 320 322-323 326-327
330 332 338 342-344 348-349 354 360 
363 367 371 374 380 382-383 385 388 
394-395 398 401-403 407 409 411-412
418-420 426-427 430 435 437 442 449- 
453 459 461 465-468 470 474 477-478 
480 483 485 488 498 500 503-504 507 
515 519 522 524 529-532 538-541 544
547 555 560 563 565 569 573-574 579- 
580 582 584 587-589 593 597 601-610 
612-613 615-618 620-622 624 634 636- 
637 639 642-644 646-647 650 657 663-
664 674 676 679 688-689 691 693 696 
701-703 713 715 717 728 730 732 738-
739 741-743 745 749 751 753 763 767
769 772-773 785-786 793 796-797 812 
821-824 830-833 837 848 856 859 861 
864 868-870 876-877 887 891 893-894 
898 903-904 907-911 913-918 921 923
925-926 930-931 936 942 949-950 958
961 966-967 969 972-973 


induced neuron 
cells 


Stratagene


NTD001 


9 65 82 92 106 113 142 146 156 172 176 
191 208 221 258 277 328 333 346 361-
362 371-372 375 388 410 414 418-419 
440 471 484 495 516 524 529-530 592
610 628 642 650 745 748 752 761 793 
818 848 851 897 


retinoic acid
induced neuron 
cells 


Stratagene


NTR001 


19 87 184 305 385 440 474 626-627 643 
748 799 834 977 


neuronal cells 


Stratagene


NTU001 


19 33-34 42 70 82 87 109 115 126 146 
172 185 188 194 212 255 269 274 283 
312 317 329 340 361-362 367 379 394 
399 401 410 420 426-427 474 479 507 
530 579 582-583 610 617-618 636 643 
658 732 740 765 769 784 791 793 799 
802-803 818 842 851 864 897 907 932


pituitary gland 


Clontech 


PIT004 


3 19 123-124 194 255 354 358 373-374 
377 426-427 462 492-494 635 785-786 
793 893-894 


placenta 


Clontech 


PLA003 


138 176 574 896 972 


prostate 


Clontech 


PRT001


3 9 16 57 65 75 83 108 130-134 138 141 
146 149-150 159 182 186-187 190 203
209 234-235 276 283 322 413 415 442
449-450 453 480 484 490 499-500 503 
505-506 523 537 543 564 583 602-603 
611 619 623 643 650 697 711 729 761 
765 770 776-778 784 789 819 822 831 
839 862 866 887 904 907 921 935 962- 
963 967 973 


rectum 


Invitrogen 


REC001 


19 30 33-34 66 108-109 123-124 126 129- 
131 143 149 151 156 164 190 201 240
247 250 263 268 274 279 287 295 298- 
299 310 314 332 341 354 384 394 401 
420 425 442 446 459 483 485 520-521
532 545 559 580-581 584 592 602-607 
610 612 615 619 634 637 646 655 664 
683-684 741 769 793 822 870 908-911 
914-916 934 937-938 942 967 973 982 


salivary gland


Clontech 


SAL001 


16 68 74 84 121 123-124 156 172 190 203 
209 232 248 254 269 292 294 363 377 
395 398 400 402 405-406 410 430 442 
459 462 474 483 485 563-564 579 587- 
588 599 602-603 643 658 699 728 730 
737 741 748 794 822 867 876 897 903
981


salivary gland


Clontech 


SALs03 


217 254 270 388 610 


skin fibroblast 


ATCC 


SFB001 


517 949 


skin fibroblast 


ATCC 


SFB002 


269 688 


skin fibroblast 


ATCC 


SFB003 


3 203 897 907 


small intestine 


Clontech 


SIN001 


3-4 47 57 68-69 92 99 125-126 130-131 
135 149 151-152 156 159 185 204 241 
246 291-292 318-319 338 343 348 363 
373 375 382 388-389 392-394 397 400 
437 466-467 471 484 500 517 520-521 
525 547 560 580-581 588 599 602-603 
612 624 643 711 731 733-734 757 761 
769 774-775 794 824 864 904 906 910- 
911 913 948 953 959 976 984 


skeletal muscle 


Clontech 


SKM001 


15 75 135 146 172 190 218 267 282 308
410 426-427 474 505 588 620 623 658 
692 713 737 779 790 862 874 878 887 
952 962-963 


skeletal muscle 


Clontech 


SKMs04 


215 


spinal cord 


Clontech 


SPC001 


14 20-21 25 28-29 31 39 46 48 59 78 83- 
84 91-92 103 112-113 135 160 168 172
176 188 190 205 209 229 232 258 285
301 308 312-314 321 323 329 346 374
377 380 383 388 394 398 406 409-410
431 449-450 453 455 466-467 470-471
484-486 488 495 497 500 503 508-509
524 537 539 558 581 586 604-605 611
619 623 630-631 633 656 663 711 715
729 736 740-741 761 767 769 776-778
780 818 822 831 835-836 840 843 859 
861 871 875 887-888 897 906-907 913 
919-920 928 931 953 958 


adult spleen 


Clontech 


SPLc01


3 6 12-13 66 130-131 178 365 403 431 
461 558 610 715 797 809 876 947 967 


stomach 


Clontech 


STO001 


35 114 130-131 144 155 176 189 206-207 
249 260-262 336 382 398 425 431 453 
461 483 496 500 527 530 580 642 657 
663 669 748 765 768 802-803 839 891 
942 981 


thalamus 


Clontech 


THA002 


30-32 48 66 109 127 130-131 135 142 
145 156-158 168 172 174 185 199 224- 
225 233 246 277 282 286 293 322 332 
334 346 374 384 400 402 420 424 435- 
437 446 466-467 485 503 506 527 542
549 572 612 615 622 624 633 643-644 
658 676 736 790 794 824 831 835 896 
907 950 969 


thymus 


Clontech


THM001 


10 16 20 28-29 32 37 41 52 57 66-67 74- 
75 110 118 121 129-131 141 151 159-160 
208 211 218 247 269 289 295 297 320
325 354 358 365 367 372 378 388-389 
395 398 411-412 420 423 435 452 500 
508-509 517 524 532 537 551 558 560
569 577-578 582 586 598 608 611 622
643 684 715 721-723 728 740 766 772- 
773 795 834 837 849 864 885 900 921 
946 948 958 962-963 965 972-973 982 


thymus 


Clontech 


THMc02 


1 3 9-11 16 21 27 32-34 38-39 51 55-57 
66 72 74 77-78 80 82 89-90 101 112 115 
118-119 121 123-124 126 138 144 152 
159 168 174 176 178 186-188 197 200 
208 212-214 217 225 233 243-244 246 
254 256-262 279 282 285 288-289 296- 
297 313-314 322 334 343 354-355 358- 
359 363-364 367-368 372-373 382 387- 
389 395 400 402 411 414 426-427 437
440 442 449-450 454 457 462 464 469
474 479 481 485 490-491 506 508-509
511 517 522 526 528 532 542 551 554
561-562 564 566-570 580-582 585 589
597 599-600 602-608 611 613-614 619-
621 625 628 630-631 644 646 655 669
672 677 684 686-693 697 713 717 720
728 740 746 749 760-762 767 771 775
794 797 804 808 811 816 818-819 837
840 859 880 883 887-888 896-897 903
908-911 913 916 924 936 947-948 950
962-963 965 967 970 


thyroid gland 


Clontech 


THR001 


3 8-9 14-15 19-22 28-29 39 41 55-56 66 
69 71-72 78-79 97 104-105 109 113 115 
119 121 123-124 130-133 135 138 143- 
144 146 148 151-152 156 159-163 165 
168 172 174 177 183-184 196 199-200 
203 209 211 215-218 228-229 232-236
244 254-255 258 273 282 290 292 294
297 303-306 308 311 317-318 322-323 
325-326 334-335 340 342 348 354 358 
373 377 381-382 387 394 398 401-402
405-406 409-412 416 422 425-427 429- 
431 440 449-453 462 466-468 474 478- 
479 481-484 490 492-496 500-501 505- 
506 517-518 522-525 532 537 540-541 
545 551 558 560 563-564 580 583 587- 
589 593 597 599 606-607 610 617-621 
625-628 633 635 641-643 658-659 664- 
669 674 682 686 688-691 696 699 715 
724 730 740 742-743 747 750 752 759 
761 765-766 768-769 779 789 796 802- 
803 813 818-819 822 831 837 843 845 
848-849 862 864 868-869 871 874 876- 
877 887 893-894 896-897 907-909 912 
919-921 923 925 928 936 940-942 944 
946-947 950 953 955 958-959 962-963 
967 969 973 981 


trachea 


Clontech 


TRC001 


33-34 55-56 69 74 163 172 190 209 212
267 270 297 305 314 352 413 426-427
466-467 500 502 504 580 586 610 613
633 642 688 691 711 724 738-739 774
782 816 820 839 848 862 868-869 914-
915 928 968


uterus 


Clontech


UTR001 


4 9 18 37 63-64 74 108 114-115 130-131
160 166 179 184 190 209 233 249 269
285 301 314 327 337 348 384 394 399- 
400 403 406 411 425 431 434 437 440 
462 474 485 490 508-509 526 532 579 
617-619 636 642-643 672 761 769 793 
837 849 864 887 903 906 928 934 947 
967 



TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1 


L06175 


Homo sapiens 


occurs in MHC class I region; ORF 


308 


98 


2 


Y70775 


Homo sapiens 


Follistatin-related protein zfsta. 


3094 


98 


3 


X15187 




precursor polypeptide (AA -21 to 782)


41 1? 


inn 


4 


AF110640


Homo sapiens


orphan seven-transmembrane receptor


T.AA 


mo 


5 


G03798 


Homo sapiens


Human secreted protein, SEQ ID
NO: 7879.


158 


7? 


6 


W85607 


Homo sapiens 


Secreted protein clone da228_6. 


1477 


100 


7 


Y30162 


Homo sapiens 


Human dorsal root receptor 4
hDRR4.


884


88


8 


Y15227


Homo sapiens


Leul 


391 


100


9 


Y28817 


Homo sapiens 


pt326_4 secreted protein. 


3338 


100 


10 


X92106 


Homo sapiens 


bleomycin hydrolase 


2445 


100 


11 


Y15228


Homo sapiens 


Leu2 


445 


100 


12 


U27838 


Mus musculus 


glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


432 


34 


13


U27838


Mus musculus 


glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


320 


27 


14 


Y71062 


Homo sapiens 


Human membrane transport protein, 
MTRP-7. 


2323 


99 


15


U96781 


Homo sapiens 


Ca2+ ATPase of fast-twitch skeletal 
muscle sarcoplasmic reticulum, adult
isoform 


5145 


100 


16 


M16653 


Homo sapiens 


pancreatic elastase IIB zymogen 


1435 


99 


17 


Y13398 


Homo sapiens 


Amino acid sequence of protein 
PRO346.


1749 


99


18 


Y02283 


Homo sapiens


Secreted protein clone br342_11
polypeptide sequence. 


1399 


99 


19 


Y53030 


Homo sapiens 


Human secreted protein clone d24_1
protein sequence SEQ ID NO:66.


1371 


100 


20 


AL031320 


Homo sapiens 


dJ20N2.5 (novel protein similar to
fucosidase, alpha-L-1, tissue (EC
3.2.1.51, alpha-L-fucosidase
fucohydrolase))


2597 


99 


21 


B013S4 


Homo sapiens 


Neuron-associated protein. 


1876 


100 


22 


Y68778 


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-10. 


2470 


100


23


Y55935


Homo sapiens


T-flimSin T*THQ7 r-\T"/-vt/ain 

.n union jvji jz, proicm. 


47R1 


99


24


L JJ7JJ 


Homo sapiens


xiurndji jvnoz protein. 


zoU / 


100


25 


AC024799 


Caenorhabditis
elegans


f*Anf nine cimiluritn/ tr\ TT? */*^Q^riOQ 
wuiiiaiilb Sliiillaniy 10 1 ts..\Jyj\}^.y 




31


26 


Y07972 


787 


Mliman QPPr^fprl nrnfpi'n fr^nrmpnf 

X AU 1 1 l&l I 3^/1 CiCU piV/LCU.1 LI QHi IICI 1 L 


X JHV/ 


1UU 


27 


X97630 


Homo sapiens


serine/threonine protein kinase


^7R1 


y<j 


28 


AF150755


Mus musculus


microtubule-actin crosslinking factor


DDI** 


Do 


29 


AF150755


Mus musculus


microtubule-actin crosslinking factor


379^ 




30 


Z38011 


IMliQ mn cn iluc 

I'lUi lAJLUdl/LUUO 


J— 'IVLTv IN 7 


Z700 


OO 


31 




i-Tattia canipnc 


axoneiiid.i □yncm ncavy en am 


£ACO 

OUDo 


99 


32 


AF037256 


Mus musculus 


ES2 protein 


2260 


91 


33 


S62140 




1 JL/O iiuuicdr ivin A-ouiQing proiem 






34 


S62140 


1 jiuiiiu sapiens 


1 i^o iiuuicaT i\jn.t\- Dinning proiein 




• AO . 

98 


36 


AB038237 


Homo sapiens 


G protein-coupled receptor C5L2 


1767 


100 


37


,1-/ /777H 


Homo sapiens


similar to ankyrin of Chromatium
vinosum.


6089


99 


38 


X633R0 




serum response factor-related protein


lyoo 


AA 

99 


39 


AL022072 


Schizosaccharomyces pombe


lipoic acid synthetase


1U0/ 


61 


40 


J03930 


Homo sapiens


alkaline phosphatase




100


41 


AF132968 


Homo sapiens 


CGI-34 protein 


1088 


98 


42 


AT 1 17637 


Homo sapiens


hypothetical protein




100


43 


AL021393 


Homo sapiens 


bK747E2.1 (novel protein)


1526 


100 




A.00UJ. 1 


Homo sapiens 




1886 


100 


45




Homo sapiens 


organic cation transporter; 50% 
SLmuanry to JL-4i$o4 {riu.^z iHj&yj,) 


2423 


100 


46 


W7£94^ 

vv / O^H J 


Homo sapiens


Fragment of human secreted protein 

ciic-uucu uy gene iy. 


1949 


100 


47 


Y41765 


Homo sapiens


nuuioij rivuiuoj pruLcin oequence. 




1 AA 


48 


AF097330 

ill VJ7 / JJXJ 




n i cjiiorj u e cnanij ei , po 4 xi i , l^jLJ v^*f . 


1 JID 


AA 

99 


50 


U09413 


Homo sapiens 


zinc finger protein ZNF135


1361 


57 


51 


AFOn"15t19 


Homo sapiens


Keratin 10 


2374 


100 


52 


W636P.1 

VV UJUO 1 


Homo sapiens


Jtiuman secreieu proiem i. 


1326


99 


53 


AB035303 


Homo sapiens 


cadherin-10 


4094 


100 


54


A 1 7079 


synthetic
construct


IVLKr-o 


485 


100 


55 


AT 191RQ7 


Homo sapiens




1867 


100 


56 




Homo sapiens


Jri i Kjvi cione /ooj protein 


818 


96 


57 


AF15101S 


Homo sapiens


T-T^PP 1 R4 

XTOJTV-* I OH 




"1 AA 
100 


58 


AF125042


Homo sapiens


uispnospnaic j -nucicouaase 


IDOO 


100


59 


AF118670


Homo sapiens


orphan G protein-coupled receptor


1 0*71 

iy /i 


100


60 


X04494 


Homo sapiens


picouidui puiypcpuuc 


l yuj> 


1 AA 


61 


AF208865 


Homo sapiens


PDRF 


j26 


1 AA 


62 


D15057 


Homo sapiens 


DAD-1 


567 


100 


63 


AF260665 


Homo sapiens 


histone acetyltransferase 


1510 


100


(\A 
u*+ 


AF9An£fi^ 
/irzouuoj 


Homo sapiens 


histone acetyltransferase 


1429 


96


65 


AJ277145 


Homo sapiens 


ras-related small GTPase RAB18 


1073 


100 


66




Homo sapiens 


Human secreted protein clone 

an i u / o iL protein sequence oiiv^ JJJ 

NO: 106. 


348 


100 


67 


Y82744


Homo sapiens 


DNA replication and repair 
associated protein (DRASP). 


1028 


100 


68 


Y44486 


Homo sapiens


■ Hum aTl fiPR W rprpntfir "nnlvnpnHHp 
XJ.U111CU1 vj j. iv vv lewcpiLfi p*Ji y ucuLlUC. 


1 771 
1 /ZJ 




69 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein BING4 
(similar to S. cerevisiae YER082C, 
M. sexta MNG10 and C. elegans 
F28D1.1)


3196 


100


70 


AJ276316 


Homo sapiens 


zinc finger protein 304 


1751 


52 


71 


Y18314


Homo sapiens


paraplegin-like protein


4140


99 


72 


AF157028


Homo sapiens 


protein phosphatase methylesterase-1 


2017 


100 


74 


Y71082 


Homo sapiens 


Human B-aggressive lymphoma
(BAL) protein.


1765


99


75 


AF225420 


Homo sapiens 


AD025 


734 


100 


76 


X95235 


Homo sapiens 


transcription factor AP2


217 


100 


77 


AF108420


Takifugu
rubripes


1-aminocyclopropane-carboxylate
synthase


733


56


78 


G01349


Homo sapiens


Human secreted protein, SEQ ID
NO: 5430.


650


99


79 


AL117635


Homo sapiens 


hypothetical protein 


922 


99 


81 


Z85986 


Homo sapiens 


dJ108K11.3 (similar to yeast
suppressor protein SRP40)


865


77


82 


AF183414


Homo sapiens


hemin-sensitive initiation factor 2a
kinase


3231


99


83 


G01143


Homo sapiens


Human secreted protein, SEQ ID
NO: 5224.


495


98


84


UU39b5 


Homo sapiens 


N-ethylmaleimide-sensitive factor 


3744


99 


85


Y17791 


Homo sapiens 


VAX2 protein


1496 


100 


87


AF263538 


Homo sapiens 


growth differentiation factor 3 


1944 


99 


88


Y19757


Homo sapiens


SEQ ID NO 475 from WO9922243.


1361 


100 


89


AF161493 


Homo sapiens 


HSPC144 


1185 


100 


90 


AF161493


Homo sapiens 


HSPC144 


856 


100 


91 


B25780 


787 


Human secreted protein SEQ ID 


647 


41. 


92 


U57344 


Mus musculus 


Meis3 


1007 


89 


93 


AF172854 


Homo sapiens 


cardiotrophin-like cytokine CLC


1197 


98 


94 


AL390114 


Leishmania
major


extremely cysteine/valine rich
protein


223


29


95 


AB016886 


Arabidopsis
thaliana


contains similarity to adenylate
kinase~gene_id:MCA23.18


287


38


96 


AC005525 


Homo sapiens 


F22162_1


1855 


96 


97 


B20997 


Homo sapiens 


Human nucleic acid-binding protein,
NuABP-1.


3836


99


98 


AJ006692 


Homo sapiens 


ultra high sulfur keratin


507 


70 


99 


AF172264


Homo sapiens


Traf2 and NCK interacting kinase,
splice variant 1


6942


99


100 


L11239 


Homo sapiens 


homeobox protein 


717 


100 


101 


AC004890 


Homo sapiens 


similar to zinc finger proteins;
similar to AAC01956
(PID:g2843171)


2154


98


102 


AC003682 


Homo sapiens 


R28830_2


1287 


48 


103 


AF201839 


Rattus
norvegicus


dynamin IIIbb isoform


4270


95


104


Y79510 


Homo sapiens 


Human carbohydrate-associated
protein CRBAP-6.


1394


100


105 


Y79510 


Homo sapiens 


Human carbohydrate-associated
protein CRBAP-6.


1209


90


106 


AL096748


Homo sapiens 


hypothetical protein 


1216 


100 


108 


X97260 


Homo sapiens 


Metallothionein 2 


381 


100 


109 


AL034422 


Homo sapiens 


dJ1141E15.2 (novel protein)


433 


100 


110 


AF191338 


Homo sapiens 


anaphase-promoting complex subunit 
4 


683 


100 


111 


AL021712 


Arabidopsis
thaliana


putative protein


185


26


112 


AF250138 


Homo sapiens 


small stress protein-like protein
HSP22


1063


100


113 


AL109976 


Homo sapiens 


dJ794I6.1.1 (novel protein)


4176 


99 


114 


Y36151 


787 


Human secreted protein


668 


100


115 


AF110399


Homo sapiens


elongation factor 


1 uuu 


1 OA 


116 


AF210317 


Homo sapiens 


facilitative glucose transporter family
member GLUT9


2052 


00 


117 


Y73328 


Homo sapiens 


HTRM clone 082843 protein
sequence. 




inn 


118 


X040S5 


Homo sapiens 


catalase 


2846 


100 


119 


AF147717 


Homo sapiens 


ubiquitin C-terminal hydrolase
UCH37 


1 60S 


i nn 


120 


X73882 


Homo sapiens 


microtubule associated protein 


3801 


99 


121 


AC004882 


Homo sapiens


similar to PA A 16R71 

(PID:g3255952) 




i nn 


122 


M93311 


Homo sapiens


metallothionein-III




inn 
IUU . 


123 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908.


557 


94 


124 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908.


222 


53 


125 


AF232009 


Homo sapiens 


peroxisomal trans 2-enoyl CoA
reductase


1565 


99 


126 


AB004906 


Ipomoea
purpurea


transposase




on 


127 


M60165 




gUaillilC iLUUJCUllUC"*UinUUlg 

rfePiilatorv nrntpin 7 




Aft 


128


Y10319


Homo sapiens


carnitine carrier


1 SQ7 


i nn 

IUU 


129 


U75467 


Drosophila 
melanogaster 


Atu 


937 


36


130 


Z21507 


Homo sapiens 


human elongation factor-1-delta


494


87


131


Z21507


Homo sapiens


human elongation factor-1-delta




i nn 
IUU 


132 


Y58633 


Homo sapiens


Protein regulating gene expression
PRGE-26.


0 /4j 


i nn 
IUU 


133 


Y58633 


Homo sapiens


Protein regulating gene expression
PRGE-26.


zlfi 1 ft 
*+olo 


o< 

7J 


134 


M13692 


Homo sapiens


alpha-1 acid glycoprotein precursor


1 (\£Jl 
1 UD4 


OQ 

yy 


135 


U72970 


Sus scrofa 


calcium/calmodulin-dependent
protein kinase II isoform gamma-B


2723 


99 


136 


G03213 


Homo sapiens


Human secreted protein, SEQ ID
NO: 7294.


>4 ^o 


i nn 
IUU 


137 


AC005102 


Homo sapiens


small inducible cytokine subfamily A
member 24


697 


OO 


138 


AF155648 


Homo sapiens


putative zinc finger protein


Jo J J 


07 

yz 


139 


AF144638


Homo sapiens


sphingosine-1-phosphate lyase


9077 


inn 

IUU 


140 


AF152318 


Homo sapiens 


protocadherin gamma A1


477R 


i nn 


141 


B08517 


Homo sapiens 


Amino acid sequence of a beta- 
tubulin antigen 


5841 


100 


142 


X56667 


Homo sapiens 


calretinin 




00 

yy 


143 


X92763 


Homo sapiens 


tafazzins 


1 6ns 

1 UUJ 


i nn 

IUU 


144 


Y95293 


Homo sapiens 


Human GEF containing NEK-like 
kinase substrate sGNK. 


400? 


yy 


145 


AF226046


Homo sapiens 


GK003 


1 198 

I I/O 


ion 


146 


M22877 


Homo sapiens 


cytochrome c 


SS4 


yo 


147 


AJ272212 


Homo sapiens 


protein serine kinase 


2196 


10ft 
1 uu 


148 


AB026491 


Homo sapiens 


PICK1 


2114 


98 


149 


AB018580 


Homo sapiens 


hluPGFS 


1600 

I U77 


IUU 


150 


X91868 


Homo sapiens 


six1


1509 


100 


151 


AF^66505 


Mai<i milieu 1i i q 


ncpiiHnnrirJinp cvnthacp ^ 


7 1 


54 


152 


U29170 


Drosophila 
melanogaster 


ANON-23D 


883 


43 


153 


G04075 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8156. 


567 


99 


154 


AY009128 


Homo sapiens 


ISCU2 


138 


100


1 04 


R9S67Q 


Hrtmrt C5*r*tiPTic 


Human secreted protein sequence
encoded by gene 15 SEQ ID NO:68.


760 


l nn 

1UU 




AR09011 S 


7R7 

/Of 


h r\rY\ ninmip r*f mnncp HHr-1 optip 1 Acr 


i*tuu 


inn 

1UU 


106 




iviUd lllUov'U.iUa 




7091 




1 07 


AT 1164^0 


T-TfYTT* f*l CCJT"\1 f*n c 

noino oapiciib 




619 


. i nn 

1UU 


198 


X56203 


Plasmodium 

lalCipalUJIi 


liver stage antigen 


512 


24 


199 


Y70775 


Homo sapiens 


Follistatin-related protein zfsta. 


2027 


63


9AA 
2\)\j 


VR7717 
AO /ZJ / 


riomo sapiens 


a-giucosiaase 1 


4447 


yy 


9A1 

ZU1 


AP1 HI A7$l 
AT I U1U /o 


Caenorhabditis
elegans


r*T TT 1 


1 101 


AA 
40 


■ 9A9 

ZUz 


AU4 J / 1 


Homo sapiens 


precursor polypeptide (AA -22 to 
1185) 


001 1 


1 AA 


9AQ 
zoo 


AUU4 /4 


Homo sapiens 


pS2 precursor 


4££ 


1 AA 


OA/1 

ZU4 




Halocynthia
roretzi


tt_t)CT 1 


y /■+ 


D4 


90^ 


AT iHOUiy 


nonio sapiens 


nepdLuociiuiar carcinoma anugen 
gene jz-vj 


0QR 


inn 

lOU 


206 


AF071002 


Homo sapiens 


minK-related peptide 1; MiRP1


632 


100 


907 


APJ11R1 £9 


xiomo sapjciis 


ueioji iacior z 


744 


inn 


9flR 
ZUo 


TT1ft^71 


nomo sapiens 


PI 1 1 T-TT TWA 


161 


inn 

1UU 


9AQ 

zuy 


/vuuuuy 1 1 


Sus scrofa 


ribosomal protein


7R9 
/ oZ 


i nn 


210 


AB021227


Homo sapiens


membrane-type-5 matrix
metalloproteinase


3545


100


7 i i 

Z 1 1 


API RfiQ7fl 


rioino Sapiens 


cycim l, ania-oa 


9799 
z szz 


inn 


9 1 7 
Z 1Z 


api n^i£^ 


nomo sapiens 


rv-i^i couansponer iv^^*f 


^^94 


i nn 


71 "5 


. UZyZ'f'J- 


Caenorhabditis
elegans


sunuar 10 numan ^1 ivt ) u ansioi iiimg 
nmtpin rPTT? * ^99 1 S71 


609 

ouz 


19 
jZ 




AT OII^IR 


AiUlllU 0 dpi Clio 


rTT477R91 1 fnnvpl nrntp^n^ 

LlJ^t / /nzj.l ^IlUVCi piULOilly 


110S 


inn 


91 ^ 




T-TnTTin csnipnc 


ttiiicpIp HptprminnHnTi fiaptnr 

UlUdUie UCLCl 111 11 ldLlUll XaWLUl 


1262 


inn 


216 


AF083248 


Homo sapiens 


ribosomal protein L26 homolog 


739 


100 


717 
Zl I 


APAA£7^1 


nomo sapiens 


Ho/ IjU 


4701 


00 

yy 


91 R 

. Zlo 


J\i>\J\J / o_>y 


nomo sapiens 


VTA A 0100 r\rrkfpin 

jvi/t-tvu J?? pruicin 


i^^o 
jy 


00 

yy 


910 

z 1 y 


AT<T 0.76901 
/YJvUZOZy 1 


nomo sapiens 


unudmcu prove in prouuci 


R76 
oZO 


inn 


991 
ZZ1 


J 04U4 J 


nomo sapiens 


opiicc van ant 01 cancer associaiea 
polypeptide CHl-9al 1-2. 


DoD 1 


07 

y / 


997 
zzz 




xionio Sapiens 


ieriaacin-xv ^resiriCLjn^ 


71 Rfi 
/ loO 


inn 

1UU 


791 
ZZ.5 


API 1/IRA9 


nomo sapiens 


cofilin i so form 1 


R4A 
OHO 


1 AA 


224 


Y17711


Homo sapiens 


atopy related autoantigen CALC 


1611 


99 


■ 99^ 
ZZj 


ATI nfinc 1 
AT IVUUj 1 


Gallus gallus


hepatocyte nuclear factor 1a
dimerization cofactor isoform


44j 


61 


996 

ZZO 


AK" 0.969*16 


XiUlllU bapiciio 


uiuidmco proiein proQiici 


R66 


OR 

yo ■ 


997 
zz / 


76Q16R 
Z1D7300 


^ V* 1 *7 c a r*f* n s r 
0 vJLL LcUbatUJ ai 

omvrpQ nATTiKp 


ii u.ix. njvc cuiicu-vuii pro lc iii 


9in 


9S 

ZJ 


228 


AF275948 


Homo sapiens 


ABCA1 


11763 


99 


990 

ZZ7 


AP1611R4 


PTom c\ Mnipnc 
numu aapieiid 




2006 


OR 


. 9in 

ZJU 


VI 697fl 
I 1 DZ / U 


JTlulIlU dapieild 


parol ciiiin 


1 0^1 


inn 


231 


AJ245599 


Homo sapiens 


putative secreted ligand 


2379 


99 


232




Homo sapiens


Human stomach carcinoma clone
HP10412-encoded protein.


1 ^4^ 
1 jHj 


00 

yy 


233


APA0£9R£ 


Mus musculus


pecanex 1


1^91 


01 

yj 


234


\T6A6. 1 0 

v O40 1 y_jcu 
1 

■ 1 


Homo sapiens


30-NOV-1990 Human HE1 cDNA


70^ 

/yo 


1 AA 


235 


V64619_cd1


Homo sapiens 


30-NOV-1990 Human HE1 cDNA 


470 


98 


236 


AF227258 


Bos taurus 


RPGR-interacting protein-1 


1262 


38 


237 


AJ132445 


Homo sapiens 


claudin-14 


1181 


100 


238 


AL034562 


Homo sapiens 


dJ684024.2 (prodynorphin (Beta- 


1330 


100 
WO 01/57190



PCT/US01/04098



SEQ 
ID 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 


IDENTITY 


155 


AF141315 


Homo sapiens


alpha-1,4-N-acetylglucosaminyltransferase


1842 


100


156 


AF110645


Homo sapiens


candidate tumor suppressor p33
ING1 homolog 


17Q4 

l<67t 




157 


AF159297 


Zea mays 


extensin-like protein 


238 


25 


158 


AL133325 


Homo sapiens 


dJ984P4.3 (Homeobox protein
NKX2B) 


1437 


100


159 


AF073298 


Homo sapiens 


small EDRK-rich factor 2 


294 


100 


160 


AC004858 


Homo sapiens


U1 small ribonucleoprotein 1SNRP
homolog; match to PID:g4050087




i on 


161 


AB012109 


Homo sapiens 


APC10 


990 




162 


AL162751 


Arabidopsis 
thaliana 


putative protein 


194 


32 


163


AJ005698 


Homo sapiens


poly(A)-specific ribonuclease


3^<i1 


100


164 


AF117646


Homo sapiens


Iotip GRl^^ rtrntpin 

vJJLrJ pi U Lb ILL 






165 


AC004002 




similar to ciliary dynein beta heavy
chain; 78% Similarity to P23098
(PID:g118965)




i on 


166 


M10942


Homo sapiens 


human metallothionein-Ie 


381 


100 


167 


AF126484 


Homo sapiens 


CARD4 


4961 


100 


168 


AF161518 


Homo sapiens 


HSPC169 


1604 


100 


169 


M64983 


Homo sapiens 


fibrinogen beta chain 


2482 


100 


170 


M64983 


Homo sapiens 


"fibrinogen beta chain 


2679 


100 


171 


M58514 


Gallus gallus 


fibrinogen beta chain


1 05Q 

I UJ7 


7R 


172 


AF078845 


Homo sapiens 


16.7Kd protein 


786 


100 


173 


AC004774




Dlx-6 




100


174 


Z98974 


Schizosaccharomyces
pombe


putative vacuolar protein sorting-
associated protein






175 


X56203 


Plasmodium 
falciparum 


liver stage antigen 


283 


23 


176 


W74726 


Homo sapiens 


Human secreted protein fg949 3. 


lO (7 


100 


177 


AJ222967 


Homo sapiens 


cystinosin


1 070 


100 


178 


AC024796 


Caenorhabditis 
elegans 


contains similarity to TR:076167 


221 


27 


179 


Y66632 


Homo sapiens 


Membrane-bound protein PRO276.


1370 


100 


180 


AF151803 


Homo sapiens


CGT-4^ nrotein 




9R 


181 


G02694 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6775. 


283 


100 


182 


Y17292 


Homo sapiens


Human cpW HMth nrpvpntintr VfnacA 
flyPTC- 1 ^ nrotein seauence 


967A 


100


183 


AF234765 


Rattus
norvegicus 


serine-arginine-rich splicing 
regulatory protein SRRP86 


148 


27 


184 


AF151855 


Homo sapiens


CGI-97 protein


1914 


70 


185 


AF289664 


Mus musculus 


CYLN2 


4673 


90 


186 


AL022238 


Homo sapiens


dJ1042K10.2 (supported by
GENSCAN, FGENES and
GENEWISE)


4059 


100 


187 


AL022238 


Homo sapiens


dJ1042K10.2 (supported by
GENSCAN, FGENES and
GENEWISE)


9^^9 


100


188 


X83543 


Homo sapiens 


APXL 


8513 


99 


189 


AF059569 


Homo sapiens


actin binding protein MAYVEN




oo 


190 


M18135 


Rattus 


smooth-muscle alpha tropomyosin 


1306 


95 


191 


AF242194 


Drosophila 
melanogaster 


brakeless-B 


147 


52 


192 


D30689 


Bacillus 
subtilis 


subunit of nitrite reductase 


113 


29


193 


Y44984 


Homo sapiens 


Human epidermal protein-1. 


538 


97 










Neoendorphin-Dynorphin precursor, 












Proenkephalin B precursor)) 






239 


AF262027 


Homo sapiens 


eIF-5A2 


808 


100 


240 


AL079344 


Arabidopsis 


putative protein 


194 


33






thaliana 








241 


AC002394 


Homo sapiens 


Gene product with similarity to 


1542 


51 








dynein beta subunit 






242 


AJ271361 


Takifugu 


FRANK2 protein 


303 


30 






rubripes 








243 


AL021918 


Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger 


1476 


48 








protein 184) 






244 


AP190167 


Homo sapiens 


membrane associated protein SLP-2 


1736 


99 


245 


Y10601 


Homo sapiens 


ankyrin-like protein 


5877 


100 


246 


AL121771


Homo sapiens 


dJ548G19.U (novel protein 


3628 


100 








(ortholog of mouse zinc finger 












protein ZFP64) (translation of cDNA 












NT2RP3001398 (Em:AK001596)) 












(isoform 1)) 






247 


L25314 


Drosophila 


actin-related protein 


984 


47 






melanogaster 








248 


X63745 


Homo sapiens 


KDEL receptor 


1095 


100 


249 


AF112208


Homo sapiens 


13kDa differentiation-associated 


816 


100 








protein 






250 


AP001707 


Homo sapiens 


human gene for claudin-8, Accession 


1172 


100 








No.AJ250711 






251 


AL136125 


Homo sapiens 


dJ304B14.1 (novel protein) 


778 


100 


252 


AL031186 


Homo sapiens 


bK984Gl.l (supported by FGENES) 


532 


100 


253 


Y17531 


Homo sapiens 


Human secreted protein clone BL205 


639 


100 








14 protein. 






254 


AL04984> 


Homo sapiens 


dJ392M17.3 (KIAA0349 protein) 


6741 


99 


255 


AJ242972


Homo sapiens 


TOLLIP protein


1424 


99 


256 


Y94873 


Homo sapiens 


Human protein clone HP02632. 


1876 


100 


257 


AF279865 


Homo sapiens 


kinesin-like protein GAKIN


2903 


100 


258 


AL024498 


Homo sapiens 


dJ417M34.1 (novel protein) 


589 


100 


259 


R66278 


Homo sapiens 


Therapeutic polypeptide from 


830


100 








glioblastoma cell line. 






260 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB 


3226 


99 


261 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB 


2821 


100 


262 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB 


3149 


99 


263 


AF197060


Homo sapiens 


src homology 3 domain-containing 


2257 


100 








protein HIP-55 







264 


Y86262 


Homo sapiens 


Human secreted protein HAQAR23,


766 


100 








SEQ ID NO: 177. 






265 


Y56966


Homo sapiens 


Human SBPSAPL polypeptide. 


2779 


100 


266 


Y56966 


Homo sapiens 


Human SBPSAPL polypeptide. 


1018 


99 


267 


AJ300465


Homo sapiens 


putative white family ATP-binding 


1557 


95 








cassette transporter 






268 


AC004030 


Homo sapiens 


F21856 2 


3579 


99 


269 


X55954 


Homo sapiens 


HL23 ribosomal protein 


714 


100 


270 


AB033921 


Mus musculus 


Ndr1 related protein Ndr2


1855 


94 


271 


AF081886 


Homo sapiens 


ERO1-like protein


1905 


99 


272


AF166492


Homo sapiens 


small GTPase RAB6B 


1060 


100 


273


AL022238 


Homo sapiens 


dJ1042K10.4 (novel protein)


2201 


100 


274 


W88667 


Homo sapiens 


Secreted protein encoded by gene 


1530 


99 








1.34 clone riAi±$ro9. 






275 


X00129 


Homo sapiens 


precursor RBP 


1044 


97 


276 


Z47500_cd1


Homo sapiens 


11-MAY-1998 Human RHOH gene


1161 


100 








sequence. 






277 


AB049188 


Equus caballus 


ubiquitin C-terminal hydrolase 


1118 


96 




278 


AF270647 


Homo sapiens


GTT1 


1564 


100


279


AF143956 


Mus musculus


coronin-2


2414 


04 

y 6 * 


280 


R85151 




Endothelial cell polypeptide.


01 1 

7 11 


07 

yz 


281


P.RS1 SI 


Homo sapiens


Endothelial cell polypeptide.


1 03 1 


1UU 


282


r>R304R 


JVaLLLLb 


oi-i proiem 


jy ID 
















283


V1476R 

lit/ uo 


Homo sapiens


1 Jvappa JJ-ILKC piOlcm 


7037 


1 f\f\ 


7R6 


AT 03131 6 


Homo sapiens


HT9RniO 3/Tf^m 1P1 


704 


100








^liyUl UAjrSLClUlU \ L l"UOLa.y 












HpVivHmtypTiacp 1^ 
uenyui Ug^cnase i } 






287


T"*6410Q 


"I! : 

Homo sapiens


tob family 


1 77*3 
1113 


nn 


288


AR096043 


Homo sapiens




1730 


iUU 


289


1 R66 
1V10 1 000 


Homo sapiens


Krueppel-related DNA-binding 




90 








pruLcni ■ - 






290


A Tnn 1 G 1 (\ 
AJUUlolU 



Homo sapiens 


mRNA cleavage factor I 25 kDa


1217 


100 








subunit






291


V004S4 
i yy*T j*+ 


Homo sapiens


TTnman PT>O160^ /^TM^ , ^7^i6^ smmn 
nUulall i^ivVJIuUj ^UINl^/oOJ aTnuiO 


0y*+ 


100








arid cpnnpnrp ^"PO TT) Ts7O*30^ 
o-l/IU SC^UeilvC OJCiV^ WJ 1NV^.37J. 






292


V44R94 


Homo sapiens


numon moiecuie associaieu wixn ceu 




IUU 








n,*v"kl iTOt'Qfin'n ftAAf^P ^ 

pruuxcrauun, lVLriv^r-H. 






293 


A T976 1 0 1 


Homo sapiens


VJ JT rvoj O piULCUl 


7000 


100


294


API 61/106 


Homo sapiens


T-TCPPORS 
nor woo 


71 O 

/ \y 


100


295


VCC/C7C 


Homo sapiens 


Protein regulating gene expression 


1276 


100 








Jr KVjJd-z I . 






296


T 101 c/;i 
U" 1 JO 1 


Rattus


pyridoxine 5'-phosphate oxidase 




o / 






norvegicus








297


T 090S6 
i_a/x? ju 


Xenopus


j iDunuLieupruLciii 


1 674 


CI 






laevis








298


A "[77767^ 


Homo sapiens 




1 77Q 
1 12.y 


nn 


299


A F976730 
/vrzzo / jvj 


Homo sapiens


Ksylly 




no 
?o 


300




Homo sapiens 


Amino acid sequence of a human 


/Jo 


on 

89 








gastric cancer antigen protein.






301


AF17<,^.33 



Homo sapiens 


NADH-cytochrome b5 reductase 


louo 


100








isoform






302


V39706 


Homo sapiens


Uiimon ra^Antni* TY\rtlor»nl« /"DT^/^N' 

nu in oil rccepior moiecuie ^iviiw ) 


1 A76 
10/0 


ys 








ciivuueu uy incyie tiunc zozjozo. 






303 


AT7747^6»*; 



Homo sapiens 


hepatocellular carcinoma associated


CO c 


100








ring finger protein






304 


AF90RR44 


Homo sapiens


UlVf-009 

DIYl-UUZ 


■+ZO 


100


305 


AP0040R3 


Homo sapiens


b mi liar LU riU.gOo / fy***T 


1 0CR 


100


306 


AT 1 37Q7R 


Arabidopsis


putative protein


7 1 f\ 


7< 
Z3 






thaliana








307 


Y10530


a i uii i vj oap j en^ 


olfactory receptor


1 64S 
1 U*f J 


100


308 


AFl R06R1 


Homo sapiens


guanine nucleotide exchange factor


j jy / 


100


300 


AFl 1 1 RS6 

.TiX 111 OJD 


Homo sapiens


sodium dependent phosphate


7*^01 










transporter isoform NaPi-3b






310 


Y13583 


Homo sapiens


G-protein coupled receptor


7171 
Z 1 / I 


100


311 


273420 


Homo sapiens


rF146F)10 7 /'mprrnntrinvnixfatp 
v^-Lj i *tui/ iu.<6 yiii&i uap iupy mv ate 


1 ^0R 

1 J70 


100








<?ulfnrtran<:fpra$fi CFC 9 R 1 9Y& 






312 


X79535 


Homo sapiens


beta tubulin


734R 


100


313 


AF0706SR 


Homo sapiens


P^PP009 


R61 
oO I 


100


314 


AF07RR66 


Homo sapiens


OUX\X t 


1 J7J 


100


j i / 


7370R6 


Homo sapiens


phenylalkylamine binding protein


1 7C5 


100


320 


AR047RQ7 

ADuH / 0 JZ 


Macaca


hypothetical protein


7<C 


CO 
oZ 






fascicularis 








321 


Y25755 


Homo sapiens 


Human secreted protein encoded 


1440 


100 








from gene 45. 






322 


AB016531 


Homo sapiens 


PEX16 


1741 


100 


323 


AL391141 


Arabidopsis 


putative protein


274 


49 








thaliana 








325 


AF140501 


Homo sapiens


DNA polymerase iota


JU7 1 




326 


X96698 


Homo sapiens


D1075-like 


IH JU 


yo 




Apl V?39<i 
r\JT IjZjZJ 


Homo sapiens


protocadherin gamma A5 


4/oy 


100 




AF1 S1K03 

/TJL 1 J JOV/J 




LuMj pr oiein 


1 OTA 

iy /u 


100 




V74n7fl 


Homo sapiens


transcription factor Blrl


639 


81 


330


API 7i i n? 


Homo sapiens


retinal degeneration B beta 


1302


95 


jj 1 


VV j4U4U 


Homo sapiens 


Human interferon-inducible protein, 


484 


98 








tTTPT 

rlirl. 






jjZ 


a COO/l^ 1 T 

ArU^40i / 


Homo sapiens 


transcription-associated zinc ribbon 


691 


100 








protein






333


nioici 


Rattus 


Rabin3 


2129 


90 






norvegicus 










vjrUjo / / 


Homo sapiens 


Human secreted protein, SEQ ID 


621 


100 








"MO' 7Q^G 

invj. /yjo. 






jj j 


AT 




DiviZjriy.z ^ortnoiog oi a. tnaiiana 


ozo 


100














JJy 


AP1 10774 


Homo sapiens


adrenal gland protein AD-001


647 


100


337 


APtO1 141 A 


Homo sapiens 


Kruppel-type zinc finger protein 


1674 


58 


J JO 


AP7n7^nn 


Homo sapiens 


ethanolamine kinase


129 


100 


340


APf|0n^70 
riv^UZUj /7 


Arabidopsis 


putative 


3283 


50 






thaliana


phosphoribosylformylglycinamidine 












syntnase, J,jj\)y-J.yy5\) 






341 


V?R^7A 
I ZOJ /O 



Homo sapiens 


Secreted peptide clone pe503 1 . 


944 


100 




UjZ.Z/4 


Saccharomyce 


Ydr386wp; CAI: 0.12 


191 


37 






s cerevisiae 








j4j 


AH! T7 1 


synthetic 


vascular anticoagulating protein 


1661 


99 






construct 








j 44 


AtzzUUjZ 


Homo sapiens 


uncharacterized hematopoietic 


1285 


100 








stem/progenitor cells protein 












IVil-'oUjZ 






345





Homo sapiens 


Human cell-signalling protein-2. 


754 


100 


346


v^no7A 


Homo sapiens 


Human fetal brain cDNA clone- 


962 


100 








vcl6_l derived protein. 






347


API R3470 


Homo sapiens 


zo.4 KJDa protem 


1329 


100 


348




Arabidopsis 


putative cleavage and 


1383 


55 






thaliana


polyadenylation specificity factor






349


AT A37A31 


Caenorhabditis


Y IUoCjdH.o 


194 


39 






elegans 








JJv 




Homo sapiens 



Fas-ligand associated factor 3


167 


23 


351


VG3A£8 


Homo sapiens 


Amino acid sequence of a potassium 


1182 


92 








channel interactor protein. 






352


apoo^r^a 

/VTUUjojO 


Drosophila


anon2A5 


111 


45






yakuba








353 




x AUiiiu oapicilb 


myeloid DAP12-associating lectin


I U 13 


100 


354


Apnooinn 


Homo sapiens 


WD-repeat protein 6 


2882 


99


355 


UJl /JU 


Murine


reverse transcriptase 


316 


42 






leukemia virus








356


D50617


Saccharomyce


VPT C\A r )r y 

I JFJL/U4ZC 


279 


27 






s cerevisiae








357 


D50617 


Saccharomyce


VPT A 47P 


/.fy 


27 






s cerevisiae








358 


AF16143? 


Homo sapiens


XJCPPII/I 
rlorCjl4 


1059


93 


359 


AR02Q4RR 


Homo sapiens


pi 1 nr-f7 1 

. ion/1 


756 


99 


360 


AJ251024 


Homo sapiens 


outative odorant bindino nrntpm no 


17^0 


i on 


361 


U43281 


Saccharomyce 


Lpg22p 


2074 


74 






s cerevisiae 








362 


U43281 


Saccharomyce 


Lpg22p 


2153 


74 






s cerevisiae 
SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%

IDENTITY




363 


AC007153 


Arabidopsis


100632 


156 


9/1 






thaliana 








364 


AF197927


Homo sapiens


AF5q31 protein


3992 


00 


365 


D28500 


Homo sapiens


mitochondrial isoleucine tRNA


4286 


OR 








synthetase 






366 


X97868 


Homo sapiens


arylsulphatase


3141


OR 


367 


AL162048


Homo sapiens


hypothetical protein


1537 


100


368 


L36062 


Mus musculus


steroidogenic acute regulatory


180 

107 


7^ 
Zj 








protein






369 


AF113740


Homo sapiens


iliUlLipXC UL/IilaLlI pULaLiVC uuctCar 




<0 

jy 








protein






370 


M15888 


Bos taurus 


endozepine-related protein precursor


2425 


R4 

OH 


371 


X66363 


Homo sapiens


serine/threonine protein kinase


2562 


100


372


W74802 


Homo sapiens 


Human secreted protein encoded by


1532 


80 








gene 73 clone HSOEL25






373 


AF100772


Homo sapiens


tpn a <ir in -A^ 1 


1 1 535 

i X J J J 


00 


374 


AF090934 


Homo sapiens


PRO0518


389 
J oz. 


100


375


AB021643 


Homo sapiens


iTrttlsdAtrrtT^tri inHiif*fHlp tFStnerri'n+irtTi 
^UXlaLlUU UfJUl I11UUUJUIC U allSUl ipilUil 


Z /Ol 


oo 








repressor-3






376 


AB049758 


Homo sapiens


MAWD binding protein


1 33 1 
l j j i 


100


377 


AF070666


Homo sapiens


Kruppel-associated box protein


HDD 


07 


378


5559342 


xvxua oiy. 














n/S2 

r 






379 


AF149205






ID7U 


RR 
oo 


380 


AF227006 


Homo sapiens


UDP-glucose:glycoprotein


/OJl 


oo 
yy 








glucosyltransferase 2 precursor






381 


AF118566


Mus musculus


hematopoietic zinc finger protein




OT 

yz 


382 


AKOOOfilO 


Homo sapiens


unnamed protein product


olU 


100




AF779QO< 


Homo sapiens 


UDP-glucose:glycoprotein 


7851 


99 








glucosyltransferase 2 precursor






384


AFT 17046 


Homo sapiens


Link guanine nucleotide exchange 




1 Art 








factor II






385


AF195300 


Drosophila


1 R7fl 


i jy 


A 1 






melanogaster








386 


YQ40fV7 


Homo sapiens


Human secreted protein clone


1 ooo 


jU 








cai I/O iy a protein sequence ojea^ iu 












\in-7n 

1NW.ZU. 






387


T 71 8705 


Saccharomyce


i eiuo^fcp 


zuo 


Z5 






s cerevisiae








388


AF1773RR 
/IT 1 / / jOO 


Homo sapiens


cancer-amplified transcriptional




yy 








coactivator ASC-2






389


/WUUZ/HH 



Homo sapiens 


UDP-GalNAc:polypeptide N-


3469


96








acetylgalactosaminyltransferase 7






390 


AF007366 


Homo sapiens


cone sodium-calcium potassium


3 loo 


100








exchanger






391 


AF217525


Homo sapiens


Down syndrome cell adhesion


5337










molecule






392 


U81035 


Rattus 


ankyrin binding cell adhesion


30^7 
jyo / 


01 

y i 






norvegicus 


molecule neurofascin






393 


X65224 


Gallus gallus


neurofascin


4007 


7R 
/ o 


394 


X13916 


Homo sapiens


U"*yL,-recentor rplatpd nrernr^nr f A A 


4707 

T , i-7Z 


00 








-19 10 4575") 






395 


AF151083 


Homo sapiens


HSPC249 


444 


OR 

70 


396 


AB017026


Mus musculus


oxysterol-binding protein


7173 
Z 1 / j 


QR 

y© 


397 


AL035587 


Homo sapiens 


dJ475N16.4 (KIAA0240) 


2393 


100 


398 


W74813 


Homo sapiens 


Human secreted protein encoded by 


722 


92 








gene 85 clone HSDFV29. 






399 


Y71110 


Homo sapiens 


Human Hydrolase protein-8 


1637 


99 








(HYDRL-8). 
SEQ
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY


400


AF039718


Caenorhabditis 


contains similarity to lupus LA
protein homologs


325 


4 3 


401 


AE000877 


Methanothermobacter
thermautotrophicus


conserved protein 


231 


36 


402


Y9770S 


Homo sapiens


p-TiimftTi cpfrp+pH nrntpin PTirnnpfi twi 

XX LU 11 all dClrlCLGU UlUlClXl vilUUUCU l/Jr 
ppnp "Wn 70 


1 S30 

1 JJ7 


00 

yy 


403 


Z50853


Homo sapiens


CLPP 


615 


100 


405 


X03475 


Rattus
norvegicus 


ribosomal protein L35a (AA 1-110)


576 


00 


406 


AF144237


Homo sapiens 


LOMP protein 


252 


44 


407 


U20239 


Mus musculus 


fibrosin 


288 


76 


409 


AL033378 


Homo sapiens 


dJ323M4.1 (KIAA0790 protein) 


6026 


99 


410 


X54326 


Homo sapiens


glutaminyl-tRNA synthetase


7577 


99 


411


X61585 


Bos taurus


polynucleotide adenylyltransferase


3715 


97 


417 


AF917190 


Homo sapiens


"Ml FT 1 nrntpin 

ivi I .17. X_> i LJ1 VJ IC 1X1 


5271 


00 

yy 


414 


G02815 


Homo sapiens 


Human secreted protein, SEQ ID 
NO* 6R96 

liU. U07U. 


314 


95 


415


AT245029 


Homo sapiens


alpha-tubulin 8


2370 


100


416 


AF203032 


Homo sapiens


neurofilament protein


220 


21 


417


707653 




r3R0Al 9 1 /n/ivpl nrntpin ^icnfrtrnri 

WJOUAl^.l ^llUVCl piUL&lll ^IMJiUllll 

D) 


1 567 


100 


418


AT404326 


Homo sapiens




1R71 

1 o / 1 


00 

yy 


419 


AT404326 


Homo sapiens




902 


64 


420


AT 1 Jt/LU 


Homo sapiens




5334 


00 
yy 


421 


L28125 


anserina 


Hptft rrflncrfiTtPin-lilf p nrntpin 

UCua UalioUUL'lll _ lll\Cf UICXI1 


2RR 
&oo 


30 


422


W21733 


Homo sapiens


"KJTT^_ 1 pnrnHpH Kv rlnnp SO 


1 10 

1 lu 


79 


423 


S67970 


Homo sapiens 


ZNF75=KRAB zinc finger


951 


76 






TT111C/*llltl C 

iViUo Xlllidvlilllb 






OR 


426


Y73373 


Homo sapiens


WTRKi rlnnp 091 R03 nrntpin 


j j j 


56 

JO 


427


Y73373 


Homo sapiens


HTRM clone Q91 R03 nrntpin 

XX 1 JVLV1 1 OVJ JJ1 UlClil 

OuUUVllVVi 


266 


40 


428 


X61118 


Homo sapiens 


TTG-2a/RBTN-2a 


876 


100 


429 


296932 


xxajiixu ^ciL/iCfiia 


nuclear autoantigen of 14 kDa


406 


R3 
oj 


430 


AJ277291 


Homo sapiens 


HELG protein 


678 


72 


431 


YR21 S7 


Homo sapiens


npvin 
lie v ill 


359S 


00 

yy 


432 


AC007192 


Homo sapiens 


P85B_HUMAN; PTDINS-3- 


3825 


99


433 


AL021918


a jLUliivj japibjij 


b34I8.1 (Kruppel related Zinc Finger

protein 184) 


1713 

i / X J 


50 


434 


AF0R4464 


Rattus
norvegicus


OTP-Kin H in o nrntpin RFX49 


141 


90 


435 


AT 040795 




rJTri99T S 9 /'nnvpl nrntpin^ 

UJU^^XvJ.X U1UVC1 L/i f IE/ill y 


1756 


OR 

7o 


436 


M14513 


Rattus 


(Na+ and K+) ATPase, alpha(III) 

catalytic subunit


4269 


99 


437 


U33460 


Homo sapiens 


DNA-directed RNA polymerase I, 

largest subunit


8777 


98 


438 


D87076 


Homo sapiens


similar to human bromodomain
protein BR140 (JC2069)


3067 


100 


439 


L43912 


Macaca 
mulatta


mannose-binding protein A 


589 


93 


440 


D31763 


Homo sapiens 


ha0946 protein is Kruppel-related.


927 


49 


441 


U70976 


Homo sapiens 


arrestin 


2068 


99 


442 


B08069 


Homo sapiens 


A human beta-alanine-pyruvate 
aminotransferase (HAPA). 


2343 


99 


443 


AF100662


Caenorhabditis 


contains similarity to ubiquitin 


166 


24 
SEQ
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 








carboxyl-terminal hydrolase (Pfam:
UCH-1.hmm, score: 28.46) (Pfam:
UCH-2.hmm, score: 47.53)






444


D78017 


Rattus 
norvegicus 


NFI-A1






445 


AL049569 


Homo sapiens 


dJ37C10.3 (novel ATPase)


2418 




448 


AJ242540 


Volvox carteri
f. nagariensis


hydroxyproline-rich glycoprotein

DZ-HRGP 






449 


AJ133352 


Homo sapiens 


ZNF237 protein 


2006 


100


450 


AJ133352 


Homo sapiens 


ZNF237 protein 


1025 




451 


AF170708 


Homo sapiens 


T-box protein TBX3 


3700 


99 


452 


AK002080 


Homo sapiens 


unnamed protein product


1 J*tU 


00 


453 


L32977 


Homo sapiens 


Rieske Fe-S protein






454 


X51760 


Homo sapiens 


zinc finger protein (583 AA)


1533 


^7 


455 


Y01141 


Homo sapiens


Secreted protein encoded by gene 7

clone HTLFA90. 




OO 


456 


AB006631 


Homo sapiens


TTip human hnmnlno nf* ttiahcp 

X llUlliO.11 UVlllVlUg^ Ul JUlWUDG uUA ^ 


UJJ7 


i no 


457 


AF067165


Homo sapiens 


zinc finger protein 3 


977 


64 


458 


AF038169 


Homo sapiens 


unknown 


154 


38 


459 


W75214 


Homo sapiens 


Human secreted protein encoded by 
gene 19 clone HRSMC69. 


1180 


95 


460 


U97002 


Caenorhabditis 
elegans 


similar to acyl-CoA dehydrogenases 
and epoxide hydrolases; Pfam
aomam Jrruu44i \Acyi-uoA__an ), 
o core j / .4, E-vaiue— 1 . /e- io, jn—z, 
contains similarity to Pfam domain 

b VttlUC 1C iJj IN i 


583 


37 


461 


AK023114 


Homo sapiens


unnamed protein product


1 1 


00 


462 


M93134 


Friend murine
leukemia virus


pol protein




*H- 


463 


AF055473 


Homo sapiens


GAGE-8 




A7 


466 


Y51415 


Homo sapiens 


Human wild type pKe83 protein. 


2625 


100 


467 


Y51417 


787 


HllTTIft'n T^TfpR^ cr*kli r*P ^/HT-ronl" rvTVYfrpin 
i iuiiia.il ujycOj ajJllv-C V a.1 la.il L prULCUl 




1 OA 


468 


Y57936 


Homo sapiens 


Human transmembrane protein 

xi i ivjur in~ou. 


1629 


96 


469 


D38552 


Homo sapiens 


The hal539 protein is related to 
cyclophilin. 


2995 


100 


470 


Y70013 


Homo sapiens 


Human Protease and associated
protein-7 (PPRG-7).


3530 


100 


471 


AJ224747 


Homo sapiens 


C-terminal variant of hINADL
lnuiuiiing x. ainino aciu excnanges - 
and an insertion of 28 amino acids in 

frame.


7969 


100 


472 


W99665


Homo sapiens 


Human secreted protein clone 
du 1 57 12 nrotein 


1546 


100 


473 


W99665 


Homo sapiens


Human secreted protein clone

du 1 5 7 12 nrotein 


770 


00 
70 


474 


X63526 


Homo sapiens


homologue to elongation factor 1-
gamma from A.salina.




0Q 


475 


X15940


Homo sapiens 


ribosomal protein L31 (AA 1-125)


644 


100 


476 


M60832 


Homo sapiens 


alpha-2 type VIII collagen


3581 


99 


477 


ATUJ7U7 / 


Homo sapiens


OllLlgCll IN I ~\^\J~J 1 


19 1 ^ 
I J 


y / 


478 


AFT S6Q9Q 




uiiicmiiiid.Lury r ci>puiibc pruicui o 


1 JOO 


83 


479 


AF264717 


Homo sapiens 


FYVE domain-containing dual 
specificity protein phosphatase 
FYVE-DSP2 


5610 


Q0 


480 


AF044578 


Homo sapiens 


putative DNA polymerase; POL4P 


2478 


94 


481 


X89750 


Homo sapiens 


TGEF protein 


1413 


100 
SEQ
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY


482


M93107


Homo sapiens 


(R)-3-hydroxybutyrate 
dehydrogenase 


loo3 


96 







Homo sapiens 




1 K<6 


A 1 

41 


484


Ar ID 1 jdo 


Homo sapiens 


deoxycytidyl transferase; Rev1p


4^6 1 


99 


485


ZJyoooH 


Homo sapiens 






73 


486 


AJ243874 


Homo sapiens 


oligophrenin-4 


3682 


100 


487


711 111 

L\ 17j7 


Homo sapiens 


flavin-containing monooxygenase 4 


2969 


100 


488 


X56123 


Mus musculus 


talin 


4353 


77 


489 


AJ278112


Homo sapiens 


putative cell cycle control protein 


335 


23 


490 


W74843 


Homo sapiens 


Human secreted protein encoded by 
gene 115 clone HOVJ3A03. 


1013 


98 


491 


Y41337 


Homo sapiens 


Human secreted protein encoded by 
gene 30 clone HRDDV47. 


509 


36 


492


X90530 


Homo sapiens 


ragB 


1926 


99 


493 


X90530 


Homo sapiens 


ragB 


1405 


99 


494 


X90530 


Homo sapiens 


ragB 


1893 


96 


495 


AL022394


Homo sapiens 


dJ511B24.3 (KIAA0395 (probable
homeobox protein)) 


4990 


99 


496 


Y11395 


Homo sapiens 


lanthionine synthetase C-like protein
1 


2168


100 


497 


AJ010119 


Homo sapiens 


Ribosomal protein kinase B (RSK-B) 


4001 


100 


498 


G01563


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5644. 


330 


100 


499 


X54131 


Homo sapiens 


protein-tyrosine phosphatase 


10465 


99 


500 


G01082 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5163. 


549 


100 


501 


AC004142 


Homo sapiens 


similar to murine leucine-rich repeat 
protein; possible role in neural 
development by protein-protein 
interactions; 93% similarity to 
D49802 (PID:g1369906)


3676 


100 


502


AL117544


Homo sapiens 


hypothetical protein 


1226 


100


503 


AF203032


Homo sapiens 


neurofilament protein 


5115 


99 


504 


AL034417 


Homo sapiens 


D-K215D1 1.2 (similar to rat gene 33) 


2476 


100 


505 


X69090 


Homo sapiens


190kD protein 


7546 


99 


506 


U58755 


Caenorhabditis 
elegans 


coded for by C. elegans cDNA
yk34b1.5; coded for by C. elegans
cDNA yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded for
by C. elegans cDNA yk46d5.5;
coded for by C. elegans cDNA
yk43c2.5; coded for by C. elegans
cDNA yk46e8.3; coded for by C.
elegans cDNA yk43c2.3; coded for
by C. elegans cDNA yk46d5.3;
coded for by C. elegans cDNA
yk13f10.3; coded for by C. elegans
cDNA yk34b1.3


782 


55 


507


AJ293309 


Homo sapiens 


NHP2 protein 


801 


100 


508


U o 9045 


Rattus 
norvegicus 


cytoplasmic dynein intermediate 
chain 2B 


3241 


97 




A "CTIKO O 0 1 

ArOo323 1 


Mus musculus 


cytoplasmic dynein intermediate 
chain 2 


3159 


97 


510


AF202893


Mus musculus 


KiOlb 


4336 


95 


511


Y1 31 1 S 
i i j 1 1 j 


Homo sapiens


serine/threonine protein kinase




99


512 


AB030207 


Homo sapiens 


G gamma subunit 


364 


100 


513 


AF039571 


Homo sapiens 


peripheral benzodiazepine receptor 
interacting protein; PBR-IP/PRAX1 


495 


33 


514 


AB037883 


Homo sapiens 


Gb3/CD77 synthase 


1916 


99 
515


D90868 


Escherichia 
coli 


similar to 




100


516 


X98834 


Homo sapiens 


zinc finger protein Hsal2 


S900 


100


517 


AF055668 


Mus musculus 


apoptosis-linked gene 4 deltaP form


7Q04 


/o 


518 


AF019926


Mus musculus 


protein kinase




QO 


519 


M34513 


Homo sapiens 


omega protein 


337 


91 


520 


Y08612 


Homo sapiens


88kDa nuclear pore complex protein


7*31 1 


99


521 


Y08612 


Homo sapiens


88kDa nuclear pore complex protein


1 DO l 




522 


AL096766 


Homo sapiens 


dASQHIR 1 AHA A 0767 nroteirrt 


7407 


100


523 


AF186249


Homo sapiens


six transmembrane epithelial antigen
of prostate


1 7QA 


100


524 


AB029012 


Homo sapiens 


KIAA1089 protein


4933 


100 


525 


AB026893 


Homo sapiens 


vascular cadherin-2






526 


X74331 


Homo sapiens 


DNA primase (p58 subunit) 


1720 


100 


528 


AC007228 


Homo sapiens 


R31665_2


1488 


47 


529 


X14830 


Homo sapiens 


acetylcholine receptor beta-subunit 
preprotein 


2639 


100 


530


TTR A AA£ 


Caenorhabditis 


coded for by C. elegans cDNA
yk172e6.3; coded for by C. elegans
cDNA yk158f7.3; coded for by C.
elegans cDNA yk158f7.5; coded for
by C. elegans cDNA yk172e6.5


420 


39 


531 


S76838 


Mus sp. 


Dbs 


4821 


88 


532 


Z82215 




dJ68O2.2 (myosin, heavy
polypeptide 9, non-muscle) 


yoJ.o 


100


533




Homo sapiens


aUllCan 


2.1 I 


31 


534 


AF300612 


Homo sapiens


N-acetylgalactosamine-4-O-
sulfotransferase






535 


AL121928 


Homo sapiens


Un.loil*t.j vjJlCOJV*>Ll 111 allU OCL/ 

domain rvroteirA 


DODO 


99


536 


AJ271055 


Mus musculus 


iroquois homeobox protein 6


1 /it 


/o 


537 


AF180473


Homo sapiens 


Not2p 


77^7 


100


538 


AF071059 


Mus musculus 


zinc Finger RNA binding protein 


1089 


51


539 


AF023453 


Homo sapiens


actin-related protein 3-beta




100


540 


AC003030 


Homo sapiens 


R29828_1


1401 


70 


541 


AC003030 


Homo sapiens




zzy4 


100


542 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein

/rnnrinnpc in AT n7^ff/i'^^ , i 
^wUiIitilUCb in nbv/ZjwJ^ 


2152 


100 


543 


AR00611S 

AUUUU 1 J J 


Rattus
norvegicus


HiSR^ 
UDoj 


IZJo 


98 


544 


G02650 


Homo sapiens


Human secreted protein, SEQ ID
NO: 6731.


044 


97 


545 


Y07595 


Homo sapiens


transcription factor TFIIH


2.0 Id 




546 


AL133545 


Homo sapiens


A ^ C ^*NT 1 A 1 /rirtVpl T«"r\tpin cimilor 
UAJOUIN l*t.J ^IIUVCJ U1ULC1I1 allllllcU 

to 51 Hi ml cnppificitv T^nAcnnntscp^ 
lu a Liuai djjcuzijuiiji' !JJlUo|Jll<llaoC7 


704 


99


547 


X83618 


Homo sapiens 


hydroxymethylglutaryl-CoA 
synthase


2647 


100 


548 


AF134726 


Homo sapiens


NG37 


^"3^0 

HoDy 


99


549 


AB035356 


Homo sapiens


neurexin I-alpha protein




00 


551 


AB037901 


Homo sapiens


gene amplified in squamous cell
carcinoma-1


DZ1 0 


99


552 


AB043634 


Homo sapiens




OOJ 


100


553 


AP000693 


Homo sapiens 


partial CDS


4R7S 


99


554 


AF002223 


Homo sapiens 


myotubularin related 1


3490 


100 


555 


AC004893 


Homo sapiens 


similar to NEDD-4 (KIA0093); 
similar to P46934 (PID:g1171682)


1611 


100 


556 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


8328 


100 


557 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


11137 


100 


558 


X65873 


Homo sapiens 


kinesin heavy chain


4860 


100 


559 


AJ277365 


Homo sapiens 


polyglutamine-containing protein 


592 


36 


560 


AF205600 


Homo sapiens 


transposase-like protein 


407 


27


561


X71125


Homo sapiens


glutaminyl-peptide cyclotransferase


1914


100


562 


X71125 


Homo sapiens 


glutaminyl-peptide cyclotransferase 


1456 


97 




AJ*tJu*t 




myosin regulatory light chain


R07 


IDA 


564 


AF250842 


Drosophila 
melanogaster 


multiple asters 


130 


23 


565


I joDUO 


Homo sapiens


Protein regulating gene expression
PRGE-1.


io iy 


oo 


566


AT 1 01 fiOa 
/UL1Z107J 


Homo sapiens


u/no^is^zi.j ^novei protein similar 

+ri 7*pfir»riiSlnctrimn Hi nHino nrntpin 
i\J I ciiiiUUlaalUllia Uixjuiiig pilHCili 






567 


AL117352


Homo sapiens


dJ876B10.1 (novel protein (ortholog
of rat EXO84))


3713 


99 


568 


AF228603 


Homo sapiens


pleckstrin 2


1841 


100 


569 


AF239243 


Homo sapiens 


histone deacetylase 7 


3244 


86 


570 


AF087695


Mus musculus


veli 3 


QRQ 

707 


100


571 


ARf)4fi1Rl 


Homo sapiens


testis-abundant finger protein


1346 


99


572 


AC005551 


Homo sapiens 


R26529_2, partial CDS 


1020 


100 


573


Yon?Qn 

I 7Ui7U 


Homo sapiens


XlLUJLiall UCpuilaaC, JXT Awi ~ / jJiULCUJ 

sequence. 


974. 
x /*+ 


jx 


574




Homo sapiens


riUillall HiXJla IMIU ulIgClLILg UIULCJJJ. 




^9 
JZ 


575


AT 191Q^S 


Homo sapiens


HA^17H7 "5 ^LrnmnlPY 1H /'a mnrinp- 
L/n.Jl /nZJ ^i-L-UIIipiCA 1U LilUlillv 


OJ J 


7R 
/ o 


576 




Homo sapiens


■ T-Tnman cprrpfpH nrntpin T-TVX/MfrT 1^4 

SEOIDN0132 




99


577 


AL121716 


Homo sapiens


dJ202D23.2 (novel protein)


6329 


99


578 


AL121716 


Homo sapiens 


dJ202D23.2 (novel protein)


6329 


99 


579


X92715


Homo sapiens


KR AR ICIWl 7mc finder nrotein 


J XUa> 


97 


580 


X54637 


Homo sapiens


protein tyrosine kinase


5554 


70 


581 


X78817 


Homo sapiens


p115


1148 


44 


582 


AJ251245 


Rattus 


SECIS binding protein 2 


3086 


71


583 


AF113125


Homo sapiens 


E-l enzyme 


581 


100 


584


Ml QS90 

1V1 1 7Ji7 




lUXllouulIl r\ 




OR 


585


AF169677


Homo sapiens


leucine-rich repeat transmembrane
protein FLRT3


JH-UJ 


100


586


fjR7fiRS 

. UO / DO J 


Homo sapiens


similar to human transcription factor

TFES(S34159). 


OUOJ 


00 


587


VftftR7fi 


Homo sapiens


"Wnman T A."PH»1 nTAtpin cpmiPTiPP 

xiuxiiaii L*>r\jr n i pxuicxii oCLjuciiLfC 


XI 1U 


100


588 


Y99674 


Homo sapiens 


Human GTPase associated protein- 
25. 


2111 


99 


589 


D86973 


Homo sapiens 


similar to Yeast translation activator 


12033 


99 


590


AT m4452 


Homo sapiens


helix repeat containing protein) 




100




X J / J7U 


Homo sapiens


Uiimaii K/copn 7vm p T 

XX-LXXllalL AyoUGllX^llIC JLi X V^H 

polypeptide. 


R14. 


100


592 


AJ297743


Mus musculus


torsinB protein




85


593 


AF164796 


Homo sapiens 


NADH:ubiquinone oxidoreductase 

MLRQ subunit homolog


469 


100 


594 


Y41312 


Homo sapiens


Human secreted protein encoded by
gene 5 clone HLDRM43.


749 


94 


595 


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


824 


100 


596 


Y77123


Homo sapiens 


Human neurotransmission-associated 
protein (NTAP) 998868. 


2102 


98 


597 


AF215703


Drosophila
melanogaster


KISMET-L long isoform


1880


65








598 


AF070447 


Homo sapiens 


barrier-to-autointegration factor 


290 


90 


599 


X56203 


Plasmodium 
falciparum 


liver stage antigen


372 


22


600 


X79828 


Mus musculus


NK10 


202 




601 


AB004109 


Cricetulus 
griseus 


phosphatidylserine synthase II 


2262 


92 


602 


U94988 


Mus musculus 


Nulp1


2912 


89 


603 


U94988


Mus musculus


Nulp1


2800 


oO 


604 


AF006264 


Homo sapiens 


recombination and sister chromatid
cohesion protein homolog 


?R50 


100


605 


AF006264 


Homo sapiens


recombination and sister chromatid 
cohesion protein homolog 


2530 


100 


606


X82260


Homo sapiens


RanGAP1


2929 


100


607 


X82260 


Homo sapiens 


RanGAP1


1843 


07 


608 


AF160909


Drosophila 
melanogaster 


BcDNA.LD03471 


943 


58 


610 


X74801


Homo sapiens 


gamma subunit of CCT chaperonin 


2745 


99 


611 


AL031427


Homo sapiens


dJ167A19.1 (novel protein)


i^or 


100


612 


Y71072 


Homo sapiens 


Human membrane transport protein
MTRP-17.




100


613 


X16396


Homo sapiens 


precursor polypeptide (AA -29 to 
315) 


1749 


100 


614 


AK000281 


Homo sapiens 


unnamed protein product


1814 


99


615 


AB011128 


Homo sapiens 


KIAA0556 protein 


5761 




616 


U19361 


Petromyzon 
marinus 


NF-180 


?os 


91 


617 


AF045555 


Homo sapiens 


wbscr1


1208 




618 


AF045555 


Homo sapiens 


wbscr1 alternative spliced product


1318 


100 


619 


U22229 


Felis catus 


ribosomal protein L41 


128 


100 


620 


Y17169 


Homo sapiens 


A6 related protein 


1819 


100 


621 


Y12065 


Homo sapiens 


hNop56 


2956 


00 


622 


AF177758 


Homo sapiens 


ubiquitin specific protease 16


2998 


100 


623 


AF317425


Homo sapiens 


GAC-1 


3866 


100 


624 


AL050297 


Homo sapiens 


hypothetical protein 


1227 


99 


625 


AC007204 


Homo sapiens 


BC273239_1


JJ70 




626 


Z68747


Homo sapiens 


imogen 38 


2024 


99 


627 


Z68747 


Homo sapiens


imogen 38


17JO 


y / 


628 


Y70229 


Homo sapiens 


Human RNA-associated protein-10
(RNAAP-10)


3424 


99 


629 


AF191492 


Homo sapiens 


nasopharyngeal carcinoma associated 
gene protein-8


613 


100 


630 


AF119664


Homo sapiens 


transcriptional regulator protein 
HCNGP 


1574 


100 


631 


AF119664


Homo sapiens


transcriptional regulator protein
HCNGP 




RQ 

67 


632 


Y17849


Homo sapiens


ganglioside-induced differentiation
associated protein 1 


1839 


98


633 


X55740 


Homo sapiens 


5'-nucleotidase 


3012 


100 


634 


AF039688 


Homo sapiens 


antigen NY-CO-3 


931 


100 


635 


AF119662


Homo sapiens 


E46 protein 


2424 


100 


636 


AB007836 


Homo sapiens 


Hic-5 


2544


100 


637 


AF077818 


Mus musculus


threonine protein kinase 


0097 


44


638 


AL035455 


Homo sapiens 


dJ1018E9.1 (VAMP (vesicle- 
associated membrane protein)- 
associated protein B and C) 


150 


26 


639 


AF078844 


Homo sapiens 


hqp0376 protein 


416 


81 


640 


U28377 


Escherichia 
coli 


ORF f239; was ORF f191 and
ORF f194 before splice


1198 


100 


641 


AK024442 


Homo sapiens 


FLJ00032 protein 


1677 


56 


642 


U58682 


Homo sapiens 


ribosomal protein S28 


340 


100 


643 


X57432 


Rattus rattus 


ribosomal protein S2


1520 




644 


AB002348 


Homo sapiens 


KIAA0350 protein 


5186 


99


646 


Y96202 


Homo sapiens 


IkappaB kinase (IKK) binding

protein, Y2H56.


1178 


98 


647




TTiiicpnlnc 


J IN IV UillU-illU, piULCili jrNPwJJJT 1 




oi 


648


ADW7UJJ 


Arabidopsis
thaliana


contains similarity to isoamyl
acetate-hydrolyzing
esterase-gene_id:MQB2.25


■ nil / 




650 


AC002550 


Homo sapiens


Unknown gene product


858 




651 


U265Q2 


Homo sapiens


diabetes mellitus type I autoantigen




DO 


652 


X60155


Homo sapiens


zinc finger 41




100


653 


X53330 


Platynereis 

dumerilii 


H4 protein (AA 1 - 103) 


523 


100 


654 


AC003682 


Homo sapiens 


R27945_2 


2558 


100 






Mus musculus


rahlQ 


J70 


-JO 


656 


J0964Q 


Rattus
norvegicus


unknown protein


£\J L 




657 


AC006014 


Homo sapiens


similar to RFP transforming protein;
similar to P14373 (PID:g132517)






658


X92972


Homo sapiens


protein phosphatase


1UUW 


iV\J 


659 


L35269 


Homo sapiens


zinc finger protein


2803 


99


660 


AC003682 


Homo sapiens


F18547_1


3184 


7U 


661 


X79204 


Homo sapiens 


ataxin-1 


4195 


99 


662 


X17620


Homo sapiens 


Nm23 protein 


965 


99 


663 


AB015617 


Homo sapiens 


ELKS 


1501 


80 


664 


Z56281 


Homo sapiens 


interferon regulatory factor 3 


2331 


100


665




Pyrococcus
abyssi


LACTOYLGLUTATHIONE
LYASE (EC 4.4.1.5)
(METHYLGLYOXALASE)
(ALDOKETOMUTASE)
(GLYOXALASE I)


254 


40 


666 


Z70200 


Homo sapiens


U5 snRNP-specific 200kD protein


RR1 Q 


99


667 


Z70200 


Homo sapiens


U5 snRNP-specific 200kD protein


OJO/ 


97


668 


AF153450 


Manduca sexta 


juvenile hormone esterase binding
protein


225 


32 


669 


AF227198


Homo sapiens


CrkRS 


79^1 

/ Z.J 1 


99


670 


X99586


Homo sapiens 


SMT3C protein 


441 


87 


671 


Z61589_cd1


Homo sapiens


17-AUG-1998 DNA encoding a
iiunjaii uL*i proicin. 


2593


100 


672 


AJ132702 


Mus musculus 


ATFa-associated factor


3240 


88 






Homo sapiens


potassium large conductance
calcium-activated channel beta 3a
subunit


1450 


100


674 


G02061 




Human secreted protein, SEQ ID

NO: 6142. 




99


675 


G01246 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5327.


141 


77 


676 


AB016839


Homo sapiens


mob1


419 


42 


677 




Homo sapiens


similar to myosin heavy chain: 
Containing ATP/GTP-binding site 
motif A(P-loop) 


ioj 




678 


U83115 


Homo sapiens 


non-lens beta gamma-crystallin like 
protein 


8569 


99 


679 


AF203687 


Homo sapiens 


prolactin regulatory element-binding 
protein 


2181 


100 


680 


M27685 


Mus musculus 


ultra-high sulphur keratin 


650 


58


681 


U04968 


Cricetulus 
griseus 


nucleotide excision repair protein 


3712 


97 


682 


AF119663 


Homo sapiens 


G-protein gamma-12 subunit


356 


100 


683 


G03733 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7814. 


342 


100 


684 


X67699 


Homo sapiens 


CDw52 antigen 


297 


100 


685 


AP022789 


Homo sapiens 


ubiquitin hydrolyzing enzyme 1 


1892 


100 


686 


AJ001006 


Mus musculus 


EMeg32 protein 


938 


96 


687


W03516 


Homo sapiens 


Prostaglandin DP receptor. 


1864 


100 


688 


AF019661 


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


689 


AF156557


Homo sapiens 


stomatin related protein 


2036 


100 


690 


G03960 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8041. 


593 




691 


AF161512 


Homo sapiens 


HSPC163 


738 


100 


692 


AL031115 


Homo sapiens 


ZXDA, ZXDB (zinc finger X-linked 
protein) 


4298 


100 


693 


L40410 


Homo sapiens 


thyroid receptor interactor 


806 


100 


694 


AC004542 


Homo sapiens


OXYSTEROL-BINDING
PROTEIN-like; similar to P22059
(PID:g129308)


2533


99


695 


AF169411 


Rattus 
norvegicus 


PAPIN 


4144 




696 


Y58168 


Homo sapiens 


Human hydrolase homologue HHH- 
4. 


2144 


100 


697 


AF271994 


Homo sapiens 


dopamine responsive protein DRG-1 


1613 


100


698 


Y41741 


Homo sapiens 


Human PRO704 protein sequence. 


1323 


100 


699 


AL133506 


Unknown 


/prediction=(method:""genscan"",
version:""1.0"", score:""109.13"");
/prediction=(method:






700 


Y96870


Homo sapiens 


Human goose-type lysozyme 
(GOLY). 


1032 


100


701 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


1190 


100


702 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


937 


95 


703 


AJ242832 


Homo sapiens 


calpain 


3756 


100 


704 


S52624 


Homo sapiens 


unknown 


185 


100 


705


AF005081 


Homo sapiens 


skin-specific protein 


652 


100 


706 


Y16793


Homo sapiens 


keratin, type I 


2232 


100 


707 


Y44985 


Homo sapiens 


Human epidermal protein-2. 


455 


69 


708 


AF113220


Homo sapiens 


MSTP040 


686 


100 


709 


Y44985 


Homo sapiens 


Human epidermal protein-2. 


408 


65 


710 


Y16132 


Homo sapiens 


CDT6 


1874 


100 


711 


Y68775 


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-7. 


2407 


100 


712 


X63422 


Homo sapiens 


H(+)-transporting ATP synthase 


209 


100 


713 


AF169968


Mus musculus


DNA binding protein DESRT


1467 


79 


714 


X52563 


Bos taurus 


permeability increasing protein


383 


29 


715 


AJ277739 


Homo sapiens 


RPB11b1 alpha protein


480 


98 


716 


AL135791 


Homo sapiens 


bA162G10.3 (zinc finger protein) 


401 


98 


717


AF223466 


Homo sapiens 


HT015 protein


1311 


97 


719 


AF117383


Homo sapiens 


placental protein 13; PP13 


746 


100


720 


Z98743 


Homo sapiens 


dJ181C9.2 (Rho GTPase activating
protein 8 (RhoGAP, p50RhoGAP)) 


324 


100 


721 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


722 


G01436 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 5517.


418


96






723 


AF282919 


Mus musculus 


Zfp228 


349 


49 


724 


AB023191 


Homo sapiens 


KIAA0974 protein 


2953 


100 


725 


AL031778


Homo sapiens


dJ34B21.1 (novel BZRP
(benzodiazapine receptor (peripheral) 

Isoquinoline-binding protein)) LIKE 
protein) 


09 0 

7zu 




726 


AL021939 


Homo sapiens 


dJ352A20.2 (aldehyde 
dehydrogenase family protein)


1764 


100 


727 


AF182426


Rattus 
norvegicus 


arylacetamide deacetylase 


791 


42 


728


Y08565


Homo sapiens


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


3331 


99 


729 


AF155135 


Homo sapiens 


novel retinal pigment epithelial cell 
protein 


1652 


99 


730 


AL078606 


Arabidopsis 
thaliana 


putative protein


977 




731 


Y73352 


Homo sapiens 


HTRM clone 1732368 protein
sequence.


1720


100


732 


AF178432 


Homo sapiens 


SH3 protein 


3302 


100 


733 


Y17832 


Human
endogenous
retrovirus K


env protein


997 




734 


Y28859 


Homo sapiens 


Human mesoderm induction early 


2067 


98 


735 


U09355 


Oryctolagus 
cuniculus 


protein phosphatase 2A1 B gamma
subunit 


9^S9 


99


736


Y94922 


Homo sapiens 


Human secreted Drotein clone nvfi 1 
protein sequence SEQ ID NO:50. 


794 


99


737 


AB027003 


Mus musculus 


protein phosphatase 


378 


84 


738 


AF112200


Homo sapiens


NADH-oxidoreductase B18 subunit


739


100 


739 


AF112200


Homo sapiens


NADH-oxidoreductase B18 subunit


613


88


740 


AF302154 


Homo sapiens 


SPG protein 


6556 


100


741 


B25681 


Homo sapiens 


Human secreted protein sequence
encoded by gene 17 SEQ ID NO:70. 


1410 


99


742 


L27479 


Homo sapiens 


X123 


1237 


99


743 


L27479 


Homo sapiens 


X123 


1206 


97 


744 


Y66745 


Homo sapiens


Membrane-bound protein PRO1186.


588


99


745 


AJ001019 


Homo sapiens


ring finger protein 


1292 


99


746 


X68453 


Sus scrofa 


tubulin-tyrosine ligase


1 ooz. 


94


747 


Y57897 


Homo sapiens 


Human transmembrane protein 
HTMPN-21. 


1173 


100 


748 


AF151069 


Homo sapiens 


HSPC235 


1694 


96 


749 


AF182404


Homo sapiens 


mitochondrial uncoupling protein 1 


1674 


100 


750 


AL121993 


Homo sapiens 


dJ776P7.1 (Novel protein) 


2500 


99 


751 


AF149825


Homo sapiens


PACSIN3


99S3 


100


752 


AL008635 


Homo sapiens 


dJ510H16.2 (high-mobility group
protein 2-like 1)




99


753


Y57914


Homo sapiens


Human transmembrane protein
HTMPN-38.


1194


100


754 


AF285109 


Homo sapiens 


septin 3 isoform B 


1766 


100 


755 


AF004161 


Oryctolagus 
cuniculus 


peroxisomal Ca-dependent solute 
carrier 


9371 


7 J 


756 


Z19585 


Homo sapiens 


thrombospondin-4 


4239 


100


757 


AP001745 


Homo sapiens 


similar to zinc finger 5 protein 


1857 


100 


758 


AF190664


Mus musculus 


LMBR2 


555 


72 


759 


AF090326 


Mus musculus 


AE-1 binding protein AEBP2


1540 


97 


760 


AL096677 


Homo sapiens 


dJ322G13.3 (novel protein similar to
bovine and mouse beta-soluble NSF
attachment protein (SNAP-beta))


999


94






761 


AC003007 


Homo sapiens 


Unknown gene product (partial) 


649 


96 


762 


U66372 


Bos taurus 


ribosomal protein S29 


230 


73 


764 


Y90899 


Homo sapiens 


D1-like dopamine receptor activity
modifying protein SEQ ID NO:1.


1152 


100 


765 


U88169


Caenorhabditis
elegans


similar to molybdopterin biosynthesis
MOEB proteins


1904 

JL iut 


UJ 


766 


AL118506


Homo sapiens 


dJ591C20.3.1 (novel DnaJ domain 
protein, similar to mouse and bovine 
cysteine string protein) 


1091 


100


767 


AK024693 


Homo sapiens 


unnamed protein product 


3767 


100 


768 


Z11518


Homo sapiens


histidyl-tRNA synthetase 


2582 


100 


769 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 
-19 to 4525) 


25529 


100 


770 


AC009360 


Arabidopsis 
thaliana


Contains 3 PF|00400 WD40 G-beta
repeat domains. 


333 


J J 


771


AB037685


Mus musculus


LANP-like protein 


1246 


91 


772


AL161578


Arabidopsis
thaliana


putative protein


335 


46 


773


AL161578


Arabidopsis
thaliana


putative protein


j jj 


47


774 


AY008271 


Homo sapiens 


helicase SMARCAD1 


5264 


OQ 


775 


Y21591 


Homo sapiens 


Human secreted protein (clone 
CC332-331 


1127


96 


776 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


777


W88853


Homo sapiens


Polypeptide fragment encoded by
gene 89.


752


100


778 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


779


AF196481


Homo sapiens 


RING finger protein; FXY2 


3644 


100 


780 


AL035427 


Homo sapiens 


dJ769N13.1 (KIAA0443 protein.) 


1609 


54


781 


AB026187 


Homo sapiens 


protocadherin-Xa 


5244 


100 


782 


B24458 


Homo sapiens 


Human secreted protein sequence
encoded by gene 22 SEQ ID NO:83. 


1002 




783 


AB027289 


Homo sapiens 


cyclin-E binding protein 1 


5421 


100 


784 


G02916 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6997. 


627 


100 


785 


AJ245822 


Homo sapiens 


type I transmembrane receptor


4560 


100 


786 


AJ245820 


Homo sapiens 


type I transmembrane receptor 


4624 


100 


787 


Z48042 


Homo sapiens 


GPI-anchored protein p137


3340 


99 


788 


AL031782 


Homo sapiens 


dJ708F5.1 (PUTATIVE novel 
Collagen alpha 1 LIKE protein) 


2739 


100 


789 


AJ131245 


Homo sapiens 


Sec24B protein 


6602 


100 


790 


AF107203


Homo sapiens 


ataxin 2-binding protein 


2008 


100 


791 


Y14690 


Homo sapiens 


procollagen alpha 2(V) 


600 


34 


792


AL031055 


Homo sapiens 


dJ28H20.2 (novel protein) 


1267 


100 


793 


Y36194 


787 


Human secreted protein 


2051 


99 


794 


AB028127 


Homo sapiens 


mannosyltransferase 


2138 


96 


795 


AC007228 


Homo sapiens 


R31665_2


2738 


79 


796 


AL049482 


Arabidopsis 
thaliana 


putative protein 


436 


47 


797 


AC004528 


Homo sapiens 


R32184_3 


891 


91 


798 


AB037830 


Homo sapiens 


KIAA1409 protein 


7532 


100 


799 


X53793 


Homo sapiens 


5' half of the product is homologous
to Bacillus subtilis SAICAR
synthetase, 3' half corresponds to the 
catalytic subunit of AIR carboxylase 


2232 


100 


800 


Y99350 


Homo sapiens 


Human PRO1378 (UNQ715) amino
acid sequence SEQ ID NO:33.


1343


100


801 


AB042636 


Homo sapiens


junctophilin type 3


1225 


47 


802 


AB029324 


Rattus 
norvegicus 


TIP120-family protein TIP120B


3916 


00 


803 


AB029324 


Rattus 
norvegicus 


TIP120-family protein TIP120B


4961 


QO 

7V 


804 


AF251040 


Homo sapiens


putative nuclear protein


2119


100


805 


AB033281 


Homo sapiens 


F-box and WD-repeats protein beta- 
TRCP2 koform P 


2879 


100 


806 


U87305 


Rattus 


transmembrane receptor UNC5H1 


3257 


90 


807 


AF118889 


Rattus 
norvegicus


b-tomosyn isoform 


3155 


97 


808 


AF226993 


Rattus 


selective LIM binding factor 


8793 


95 


809 


W19919 


Homo sapiens 


Human Ksr-1 (kinase suppressor of 

Ras)


3939 


99 


810 


AL031782


Homo sapiens


dJ708F5.1 (PUTATIVE novel
Collagen alpha 1 LIKE protein)




100


811 


AC002542 


Homo sapiens


similar to C. elegans F11A10.5; 80%
similarity to Z68297 (PID:g1130619)




100


812 


U83246 


Homo sapiens 


copine I 


606 


52 


813 


AF242552 


Gallus gallus 


retinovin 


945 


34 


814 


X52332 


Homo sapiens


zinc finger protein 10


1 0-J 1 




815 


X52332 


Homo sapiens


zinc finger protein 10




99


816 


Y09631 


Homo sapiens 


PIBF1 protein 


2935 


99 


817 


X71997 


Rattus
norvegicus


myosin I 


3883 


98 


818 


AY004877 


Mus musculus


cytoplasmic dynein heavy chain


l 1 1UJ 


051 


819 


Y27196 


Homo sapiens 


Human cyclic nucleotide 

|JliUopiLUUiCdlCI i xJC*Qxj\xjJ alllillU 

acid sequence. 


3790 


100


820 


AF081947






1 U*f 


o 1 


821 


AL035106


Homo sapiens


HTOORP1 1 1 ^mntimipc in 
UJ770U1 1.1 v^vUIlLuluCo ill 

Em • AL445 1 92 as bA269H4 1 


o / 1 


100


822 


AF022795 


Homo sapiens 


TGF beta receptor associated protein- 
1 


385 


24 


823 


AF015770


Mus musculus


radical fringe




89


824 


U82695 


Homo sapiens




1444 


00 


825 


X77371 


Mesocricetus
auratus


COR1 


641 


78


826 


AB014576 


Homo sapiens 


KIAA0676 protein 


296 


79 


827


AL049733


Homo sapiens


dJ875H3.1 (APK1 antigen)


1 ^84 


77 


828 


AF222980 


Homo sapiens 


disrupted in Schizophrenia 1 protein


4418


100


829 


Z31560 


Homo sapiens 


sox-2 


1683 


100 


830 


AF295773 


Homo sapiens


ral guanine nucleotide dissociation
stimulator


4717


99


831 


AB041926 


Homo sapiens


GCK family kinase MINK-2


6866


100


832 


L04948 


Saccharomyces
cerevisiae


mitochondrial transporter protein






833 


AJ007012 


Mus musculus 


Fish protein


704 


04 


834 


Z34289 


Homo sapiens 


nucleolar phosphoprotein p130


3455


99


835 


U10991 


Homo sapiens 


G2 


8436 


98 


836 


AF230877 


Homo sapiens 


MIP-T3 


2945 


99 


837 


X58288 


Homo sapiens 


protein-tyrosine phosphatase 


7734 


99 


838 


X56958 


Homo sapiens 


ankyrin (brank-2) 


9631 


100 


839 


AC024791 


Caenorhabditis 
elegans 


contains similarity to beta-lactamases 


370 


24 


840 


D83197 


Homo sapiens 


ankyrin repeat protein 


802 


99 


841 


AF053711 


Serinus 
canaria 


neurofilament medium subunit 


192 


31 


842 


AF283772 


Homo sapiens 


similar to Homo sapiens ribosomal 
protein L10 encoded by GenBank 
Accession Number L25899 


990 


96 


843 


U76343 


Homo sapiens 


GABA transport protein


2992


98


844 


Y13645 


Homo sapiens


uroplakin II


897


100


845 


D21064 


Homo sapiens 


similar to rat general mitochondrial
matrix processing protease mRNA 
(RATMPP). 


2710 


99


846 


AF192522


Homo sapiens 


Niemann-Pick C3 protein; NPC3 


7047 


100 


847 


AF192522 


Homo sapiens 


Niemann-Pick C3 protein; NPC3 


5472 


100 


848


X60489


Homo sapiens


elongation factor-1-beta


1162 


100 


849 


AC007204 


Homo sapiens 


BC273239_1


2277 


67 


850 


AC003682 


Homo sapiens 


R28830_1


2401 


100 


851 


AL121583 


Homo sapiens 


bA358N2.1 (novel protein)


353 


61 


852 


Z48475 


Homo sapiens 


glucokinase regulator 


3155 


99 


853 


Z83844 


Homo sapiens 


dJ37E16.2 (SH3-domain binding 
protein 1) 


1884 


98


854 


AF233323 


Homo sapiens 


Fas-associated phosphatase-1


390 


36 


855 


AF062741 


Rattus 
norvegicus 


pyruvate dehydrogenase phosphatase 
isoenzyme 2 


447 


80 


856 


Y11411 


Homo sapiens 


pristanoyl-CoA oxidase 


3595 


98


857 


M97188 


Strongylocentrotus
purpuratus


tektin A1


290 


■ HQ 


858 


AB001105 


Homo sapiens 


hippocalcin-like protein 4 


995 


100 


859 


AF164791 


Homo sapiens 


putative 38.3kDa protein 


1795 


100 


860 


AF298117 


Homo sapiens 


homeobox protein OTX2 


1477 


93 


861 


AF015264


Rattus 
norvegicus 


golgi peripheral membrane protein 
p65 


1820 


81 


862 


X16901 


Homo sapiens 


30kD subunit of RAP30/74


1284 


100 


863 


M12140 


Homo sapiens 


envelope protein 


202 


O 1 


864 


AF161459


Homo sapiens 


HSPC109 


815 


98 


865 


AL109983 


Homo sapiens 


dJ718P11.1.1 (novel class II
aminotransferase similar to serine
palmitoyltransferase (isoform 1))


AAA 

< T*T t T 




866 


M77183 


Rattus 
norvegicus 


alpha-1-macroglobulin


227


45


867 


AF272663 


Homo sapiens 


gephyrin 


3785 


100 


868 


X75285 


Mus musculus 


fibulin-2 


3258 


87 


869 


X82494 


Homo sapiens 


fibulin-2 


3407 


99 


870 


AJ297743


Mus musculus 


torsinB protein 


169 


43 


871 


AJ278313 


Homo sapiens 


phospholipase C-beta-1a


6258 


99 


872 


AF073344 


Homo sapiens 


ubiquitin-specific protease 3 


256 


43 


873 


Y91955 


Homo sapiens 


Human cytoskeleton associated 
protein 10 (CYSKP-10). 


535 


inn 


874 


AJ000414 


Homo sapiens 


Cdc42-interacting protein 4


1136


S3 
jj 


875 


AF265555 


Homo sapiens 


ubiquitin-conjugating BIR-domain
enzyme APOLLON


627 


inn 


876 


Y48586 


Homo sapiens 


Human breast tumour-associated 
protein 47. 


2537 


OX 
yo 


877 


AF182198 


Homo sapiens 


intersectin 2 long isoform


8764 


99 


878 


L17308 


Gossypium 
hirsutum


proline-rich cell wall protein 


192 


35 


879 


AF177169 


Homo sapiens 


tropomodulin 2 


1769 


100 


880 


W03627 


Homo sapiens 


Human follicle stimulating hormone 
GPR N-terminal sequence. 


210 


23 


881 


AL021068 


Homo sapiens 


dJ206D15.3 


2615 


99 


882 


AC005498 


Homo sapiens 


R31665 2 


318 


82 


883 


AF165518 


Homo sapiens 


MAGOH isoform 


182 


y** 


884 


D21211 


Homo sapiens 


protein tyrosine phosphatase (PTP- 
BAS, type 3) 


368 


AT. 


885 


U13045 


Homo sapiens 


nuclear respiratory factor-2 subunit 
beta 1 


869 


62 


886 


X52836 


Homo sapiens 


tryptophan hydroxylase (AA 1-444)


2320 


70 


887 


X51466 


Homo sapiens 


elongation factor 2 


4460 


100 


888 


AB039903 


Homo sapiens 


interferon-responsive finger protein 1
long form 


1096 


OR 


889 


X51760 


Homo sapiens 


zinc finger protein (583 AA)


3130 


inn 


890 


AJ243396 


Homo sapiens


voltage-gated sodium channel

subunit 




inn 

JUL/ 


891 


W67928 


Homo sapiens 


Fragment of human secreted protein 
encoded by gene 4. 


391 


100 


892 


AB020598 


Homo sapiens 


peptide transporter 3 


3017 


100 


893 


Y66648 


Homo sapiens 


Membrane-bound protein PRO11?0


4722 


yy 


894 


Y66648 


Homo sapiens


Membrane-bound protein PRO1170




yo 


895 


A29218 cd 
1 


Homo sapiens


19-NOV-1998 DNA encoding G-
protein coupled 7 TM receptor with
AXOR15 activity.


917R 


inn 


896 


AJ000332 


Homo sapiens 


Glucosidase II 


5063 


99 



897 


X98259 


Homo sapiens 


M-phase phosphoprotein 8


1085 


100 


898 


X57110 


Homo sapiens 


c-cbl protein 


4849 


99 



899


X63652 


Homo sapiens 


inter-alpha-trypsin inhibitor heavy 
chain ITIH1 


3376


98 


900 


X85134 


Homo sapiens 


RB protein binding protein 


2816 


OQ 

yy 


901 


L11672


Homo sapiens 


zinc finger protein


2047 


SR 

JO 


902 


Y85565 


Homo sapiens 


Human homologue of UNC-53 (Hs- 
UNC-53/2) sequence. 


369


R3 


903 


X54871 


Homo sapiens 


ras related protein Rab5b 


1094 


100 


904 


Z98265 


Homo sapiens 


plakophilin 3 


4065 


100 


905 


AL035295 


Homo sapiens 


hypothetical protein 


959 


99 


906 


AF051782 


Homo sapiens 


diaphanous 1 


801 


35 


907 


AF208536 


Homo sapiens 


nucleotide binding protein; NBP 


1372 


100 


908 


U79240 


Homo sapiens 


serine/threonine protein kinase 


2365 


98 


909 


U79240 


Homo sapiens 


serine/threonine protein kinase 


2386 


99 


910 


AJ132545


Homo sapiens 


protein kinase 


2921 


inn 


911 


AJ132545 


Homo sapiens 


protein kinase 


1637 


QO 
yy 


912 


AL121733 


Homo sapiens 


hypothetical protein 


1344 


00 

yy 


913 


Y67579 


Homo sapiens 


Human death inducer-obliterator 1
(DIO-1) polypeptide. 


1586 


inn 


914 


X87342 


Homo sapiens 


Human giant larvae homologue 


5317 


99 


915 


X87342 


Homo sapiens 


Human giant larvae homologue 


3495 


y\j 


916 


M94362 


Homo sapiens 


lamin B2


2357 


93 



917 


AJ011654 


Homo sapiens 


triple LIM domain protein 


3432 


100 


918 


AJ131899 


Rattus 
norvegicus 


proline rich synapse associated 
protein 1 


5776 


88 


919 


AF054986 


Homo sapiens 


putative transmembrane GTPase 


1816 


100 


920 


U95822 


Homo sapiens 


putative transmembrane GTPase 


1237 


100 


921 


Y11588 


Homo sapiens 


apoptosis specific protein 


1492 


inn 


922 


X84195 


Homo sapiens 


acylphosphatase 


510 


100 


923 


U72882 


Homo sapiens 


interferon-induced leucine zipper 
protein 


1409 


99 


924 


AE000660 


Homo sapiens 


hADV36S1


573 


100 


925 


AF126245


Homo sapiens 


acyl-Coenzyme A dehydrogenase-8
precursor 


2162 


100 
SEQ
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


926 


AE001968 


Deinococcus 
radiodurans 


hypothetical protein




2.1 


927 


W81576 


Homo sapiens 


EBV-induced G-protein coupled 
receptor (EBI-2) polypeptide. 


1778 


100 


928 


U01317 


Homo sapiens




ACT 


94 


929 


X98333 


Homo sapiens 


organic cation transporter 


2933 


100 


930 


Y91444 


Homo sapiens 


Human secreted protein sequence 
cutoacu oy gene hz anv^ id 


1401 


100 


931 


Y91644 


Homo sapiens


Human secreted protein sequence
encoded by gene 4? SEQ ID
NO:317.


IJ.H J 


100 


932


D90279


Homo sapiens 


collagen alpha 1(V) chain precursor 


569 


39 


933 


Z31560 


Homo sapiens 


sox-2 


1587 


96 


934 


AF147790 


Homo sapiens


transmembrane mucin 12 


3047 


99 


935 


Z85996 


Homo sapiens 


match: multiple proteins; match: 
Q08151 P28185 Q01111 Q43554;
match: Q08150 Q40195 P20340 
Q39222; match: Q40368 P36412 
P40393 Q40723; match: CE01798 
Q38923 Q40191 Q41022; match: 
Q39433 Q40177 Q40218 Q08146; 
match: P10949 P11023 Q16948
\IjL\j5oi, matcn. Kizojoy rZDzzo 
rzuj jo rUj / io, maiCn. rjjz /o 

OftK147 P17fiflQ P7717R- matrh- 
015771 6410 P3^?Q1* (TTP- 
binding 


726


94 


936 


AB041533 


Homo sapiens


sperm antigen


1 yJO^r 


'i q 

JO 


937 


X91906 


Homo sapiens


voltage-gated chloride ion channel


jy 14 


IUU 


938 


AB032481 


Homo sapiens 


nompfinoY tr^n QpTintirvn *f5»f»trvr 


1 1AA 


i on 


939 


AF3 ] 1 106 


Homo sapiens


protein serine/threonine phosphatase
4 regulatory subunit 1




oo 
yy 


940 


Y17999 


Homo sapiens 


Dyrk1B protein kinase


3331 


99 


941 


AF305872 


Homo sapiens 


th Vm o 1 piKi 1 1 in 




yZ 


942 


AF263462 


Homo sapiens 


cingulin 


5939 


99 


943 


AK024442 


Homo sapiens


FT TA0n^9 rvmtpin 


lolo 


61 


944 


Y35911 


Homo sapiens 


Extended human secreted protein 

Qpmipnrp WO TFi >JO 1 AO 


262 


35 


945 


AB015370 


Homo sapiens


adaptor complex 




71 


946 


Z82287 


Caenorhabditis

elegans 






o c 

35 


947 


D84223 


Homo sapiens 


leucyl tRNA synthetase


6207 


99 


948 


U49057 


Rattus
norvegicus 


rA9 


J OHO 


/CO 


949 


AK000568 


Homo sapiens 


unnamed protein product 


1659 


100 


950 


AL021578 


Homo sapiens


HTA^^f"*17 A 1 /nn/*n<ii*ar*triri'»pr1 
uj*t J J^1Z..U.1 ^UIldlaTaClCriZcU 

bvnrtthnlamnc nrnfpin /'lerk-fXi-m 
Ujr|JUUJa.laiIlUa UIULCU1 ^loUivIIII 11) 


ZD/ 


42 


951 


AB032435 


Homo sapiens


dependent inorganic phosphate

cotransporter 




OQ 

yy 


952 


AF110532


Homo sapiens 


uncoupling protein UCP-4


1 ^^1 


i on 


953 


X83587 


Mus musculus 


1A13 protein 


1420 


59 


954 


AL031665 


Homo sapiens 


dJ545L17.5.1 (novel protein) 


386 


53 


955 


Y87600 


Homo sapiens 


Human fatty acid synthase-like 
protein (HFASLP). 


2377 


100 


956 


Y99421 


Homo sapiens 


Human PRO1433 (UNQ738) amino
acid sequence SEQ ID NO:292.


522 


55 


957 


U68535 


Mus musculus 


aldo-keto reductase 


451 


73 


958


APO07067 


Arabidopsis
thaliana


T10O24.10


1 S94 

lJ7*r 


S7 


959 


U72194 


Mus musculus 


muskelin 


3947 


99 


960


AFftni661 


Drosophila
melanogaster


Pf*1 SI 68 optiA mvtHnot 

\^\J 1J1UO g&ilO UXUUUvL 


277 


Sd 


961 


X80332 


Mus musculus


rab20 


983 


R9 


962 


Y67315 


Homo sapiens 


Human secreted protein BL89_13
amino acid sequence.


3916 


99 


963


Y67315


Homo sapiens


Human secreted protein BL89_13
amino acid sequence.


3916


99


964


. I 19609 




hnmpnHnmain 1 SO 141 


1 R91 

l Ox, 1 


06 

70 


965 


Z97832


Homo sapiens


dJ329A5 3 CK1AA06460 Droteinl 


3581 


99 



966 


W88995 


Homo sapiens 


Polypeptide fragment encoded by 

gene 146.


176


39


967 


U12465 


Homo sapiens 


ribosomal protein L35 


604 


100 


968






Pf? T-4S nrntpin 




7R 
/o 


969 


W74865 


Homo sapiens 


Human secreted protein encoded by 

ffpnp 1 17 rlnnp ITM"\X/TF1S 
t^ciic id 1 dune nivi wjur 


1348 


98 


970 


L21936 


Homo sapiens 


succinate dehydrogenase flavoprotein 


703 


100 


971 


AJ133521 


Drosophila 


protease, reverse transcriptase, 

ribonuclease H, integrase


194 


23 


972




Homo sapiens


similar to 010471 fPTTVol 70QSSQY 


1971 


inn 


973 


Z81317 


Schizosacchar 


DNA2-NAM7 helicase family 

protein


685 


31 


974 


IVi. 1 / OOJ 




acidic ribosomal phosphoprotein (P0)


7Q? 


100 


975 


U22829 


Mus musculus 


P2Y purinoceptor 


399 


40 


976


AT 1 1977? 


Homo sapiens


HTimiA99 1 fhpnatir* rm rip sir •farfnr 
*t, OILM1QI 


9J.66 


00 

77 


977 


AP001971 


Homo sapiens


ZNF91L 


1550 


41 


978 


J04031 


Homo sapiens 


MDMCSF (EC 1.5.1.5; EC 3.5.4.9;
EC 6.3.4.3)


2824 


63 


979 


AF136715


Homo sapiens


taxol resistant associated protein


217 


76 


980 


AF136715 


Homo sapiens 


taxol resistant associated protein 


306 


95 


981


709090 


Caenorhabditis
elegans 


7Y S9fi 1 


1 1 no 

1 1 \)y 




982 


AJ295149 


Homo sapiens


putative dipeptidase


1564 


99 


983


AL021331 


Homo sapiens 


dJ366N23.3 (KIAA0173 and 
Tubulin-Tyrosine Ligase LIKE)


1492 


100 


984 


AL161501 


Arabidopsis 
thaliana 


putative adenosine deaminase 


370 


38 



TABLE 3 



SEQ 
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00282


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 4.259e-14 97-120


3 


BL00298 


Heat shock hsp90 proteins family 
proteins. 


BL00298A 10.97 1.000e-40 74-
119 BL00298E 27.30 1.000e-40
321-376 BL00298F 11.21 1.000e-
40 409-464 BL00298H 20.50
1.000e-40 553-607 BL00298C
16.40 2.286e-40 186-230
RT 002QRR 1 S 64 1 ?QfW» 30 1 34_ 

181 BL00298G 24.57 5.345e-39 
465-520 BL00298I 30 07 7 R1 Rp- 
34 661-715 BL00298D 17.97 
6.226e-33 242-282 


4 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 1 1 48 4 3 1 fie- 1 3 S7-R? 


5 


PD02454 


! ! ! ! PROTEIN ALU SUBFAMILY 
WARNING ENTRY NUCLEAR 
PHOSPHO. 


PD024S4B 1 1 fil 4 30Qp-17 7S- 
103 


6 


DM00864 


EGF-LIKE DOMAIN. 


DM00864A 15J21 7.429e-09 9S- 
119 


7 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00737A 1 1 4R 1 7Sf)p-1 1 7Q-SA 

PR00237D 8.94 7.000e-09 138- 
160 PR00237B 13.50 8.250e-09 
61-83 


9 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e-15 272-289 


10 


BL00139 


Eukaryotic thiol (cysteine) proteases 
cysteine proteins. 


BL00139D 9.24 4.400e-11 391-
408 BL00139A 10.29 7.511e-09 

67-77 


12 


BL01113 


C1q domain proteins.


BL01113B 18.26 9.294e-19 689-
725 BL01113C 13.18 4.857e-11
757-777 BL01113D 7.47 2.161e-


13 


BL01113 


C1q domain proteins.


BL01113B 18.26 3.813e-14 599-
635 BL01113C 13.18 4.857e-11
667-687 BL01113D 7.47 2.161e-
10 700-710


14 


BL00594 


Aromatic amino acids permeases 
proteins.


BL00594A 16.75 6.531e-10 50-94


15 


BL01047 


Heavy-metal-associated domain proteins.


"RT f»1fl47R 1 0 73 4 Q1 3p-13 7fl7- 

728 


16 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


330 PR00625B 13.48 3.939e-15 
340-361 


18 


BL00615


C-type lectin domain proteins. 


BL00615A 16.68 3.700e-09 144- 
162 


20 


PR00741 


GLYCOSYL HYDROLASE FAMILY 
29 SIGNATURE 


PR00741D 16.11 9.082e-21 175- 
195 PR00741F 14.66 9.262e-21 
243-265 PR00741B 14.23 1.947e- 
18 128-145 PR00741G9.29 
2.180e-17 318-340 PR00741C 

0 16 7 37Rp-17 147 lfiA 
".10 / ji6c-i / i*+/-ioo 

PR00741H 10.32 2.141e-13 351- 

374 PR00741A 9.24 3.596e-13 

R9-10S PR00741F 13 30 3 S3V- 

12 215-232 


22 


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107A IS 39 3 647p-90 1 17. 
148 BL00107B 13.31 1.000e-16 
182-198 


23 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126- 
157 


24 


BL00107 


Protein kinases ATP-binding region 
proteins.


BL00107A 18.39 1.600e-23 126- 
157 


27 


BL00239 


Receptor tyrosine kinase class II proteins. 


BL00239B 25.15 2.324e-16 91- 
139 


28 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.250e-10 681-694 
BL00018 7.41 6.400e-10 717-730 


29 


BL00018 


EF-hand calcium-binding domain
proteins.


BL00018 7.41 3.250e-10 681-694
BL00018 7.41 6.400e-10 717-730


30


BL01113


C1q domain proteins.


RT 01 1 1 ^ A 17 QQ Q ^flRp ftQ ^4 Rl 


33 


PD01168 


SYNTHETASE LIGASE PROTEIN 
ALANYL. 


PD01168L 9.47 1.667e-09 401-
416 


34 


PD01168 


SYNTHETASE LIGASE PROTEIN 
ALANYL 


PD01168L 9.47 1.667e-09 411-


36 


PR00426 


C5A-ANAPHYLATOXIN RECEPTOR
SIGNATURE 


i IvvvHZiUJL' 1U.J7 J.OIOC-IZ 1 IV/- 

122 


37 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 2.049e-10 1080- 
1135 


38 


BL00350 


MADS-box domain proteins. 


BL00350 20.79 1.000e-40 1-55 


40


BL00123 


Alkaline phosphatase proteins. 


BL00123B 19.31 1.000e-40 90- 
133 BL00123C 24.61 1.000e-40 
145-195 BL00123E 22.25 1.000e-
40 304-358 BL00123G 26.01 

l.UUUe-4U 4jo-4o(S bLln)123r 

19.03 8.714e-35 364-399 
BL00123A 10.80 9.000e-24 52-77 
BL00123D 12.73 1.000e-17 216-
229 


44


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 2.800e-14 346-359 
PD00066 13.92 4.600e-14 486-499 
PD00066 13.92 1.000e-13 374-387 
PD00066 13.92 6.000e-13 458-471
PD00066 13.92 2.714e-12 234-247
PD00066 13.92 3.143e-12 430-443 

ri^UUUOO u.yz o. / 14e-12 j 14-D2/ 

PD00066 13.92 3.739e-11 402-415
PD00066 13.92 2.038e-10 318-331 


45 


DM00973


YLL028W CYCLOHEXIMIDE. 


TW/fftnOTJ A *51 m Q/(/!o in ion 

LJJYLUuy / ja z i . l / 2.y4oe- 1 u 1 oU- 
217


47 


BL00649 


G-protein coupled receptors family 2 
proteins.


BL00649C 17.82 1.682e-10 475-
501 BL00649B 20.68 7.3S7e-09 
417-463 


50 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 8.200e-16 445-458
PD00066 13.92 5.846e-15 305-318
PD00066 13.92 1.000e-14 221-234
PD00066 13.92 1.000e-14 417-430
PD00066 13.92 2.800e-14 249-262
PD00066 13.92 2.800e-14 277-290
PD00066 13.92 8.800e-14 333-346
PD00066 13.92 9.400e-14 361-374
PD00066 13.92 4.000e-13 389-402 
PD00066 13.92 6.571e-12 473-486 


51


BL00226


Intermediate filaments proteins.


jdLUvZZou iy.10 1.0U0e-40 417- 
,464 BL00226B 23.86 3.348e-35 
zji-zyy Jtsi^uuzzou ij.zj i.4zye- 
24 316-347 BL00226A 12.77 
l,S57e-15 151-166 


52 


PR00217 


43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 


PR00217C 10.91 5.648e-09 133-
149


53 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 1.000e-40 143- 
191 BL00232A 27.72 2.350e-28 
4y-e2 J3LUU232B 32.79 7.052e-21 
252-300 BL00232C 10 65 6 625e- 
20 250-268 BL00232B 32.79 
1.314e-ll 367-415 BL00232C 
10.65 9.308e-10 470-488


54 


BL00303 


S-100/ICaBP type calcium binding
proteins.


BL00303B 26.15 8.759e-23 125-
162 BL00303A 23.77 1.000e-21
82-119


58


PR00378


INOSITOL PHOSPHATASE
SIGNATURE


x'KUUj/oJJ 10. oo J.UUue-15 242- 

261 PR00378B 13.80 9.250e-13 

l no 190 
iuy-izy 


59 


PR00425 


BRADYKININ RECEPTOR
SIGNATURE 


140 


60 


BL00280 


Pancreatic trypsin inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 6.727e-38 238-282 
BL00280 24.61 1.514e-30 294-338 


65 


BL01019 


ADP-ribosylation factors family proteins. 


BL01019A 13.20 1.222e-ll 43-83 


68 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-13 188-
212 PR00237G 19.63 7.207e-13 
268-295 PR00237A 11.48 4.375e- 
11 24-49 PR00237C 15.69 
j.057e-10 101-124 PR00237D 
8.94 4.750e-10 137-159 
rtWxjZo/r i j.j/ D.3o4e-10 230- 
255 PR00237B 13.50 9.438e-10 . 

Ji-Iy 


70 


PD01066


PROTEIN ZINC FINGER ZINC- 


PD01066 19.43 7.938e-28 31-70


71 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE
PROTEASE (S16) SIGNATURE


PR00830A 8.41 8.759e-12 348-
368 


79 




Lipases, serine proteins.


DL0U12U1J 11.37 2.149e-10 148- 
163 


11 


PR00753 


1-AMINOCYCLOPROPANE-1-
SIGNATURE 


PR00753E 8.01 3.552e-11 191-
zlo rKt)0753D 6.S5 2.778e-09 
131-153 


/ o 


PR00506


D21 CLASS N6 ADENINE-SPECIFIC
DNA METHYLTRANSFERASE
SIGNATURE


rKUUDUoC xyAv o.Ul /e-09 96- 

119 


82 


BL00107


Protein kinases ATP-binding region
proteins.


XjL>\J\J 1U / A lo.jy J.J / ie-io 43o- 
467 


84 


BL00675 


Sigma-54 interaction domain proteins
ATP-binding region A proteins.


BL00675A 24.86 8.800e-10 256- 

3UU 


85 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 2.286e-30 117-160


87 


BL00250 


TGF-beta family proteins. 


BL00250A 21.24 6.786e-36 264- 
300 BL00250B 27.37 1.450e-26 
328-364


91


BL00215


Mitochondrial energy transfer proteins. 


"DT AAI 1 f A 1 C OO A Ot A 1 1 A o c 

oL00215A 15.82 9.250e-17 10-35 . 

nr AAO I'C A 1C OO £. AAA- n 1 

15L0U215A 15.82 o.000e-16 221- 
246 BL00215A 15.82 7.857e-12 

11 168-181 


92 


BL00027


'Homeobox' domain proteins.


r5J_/UUUi / Z0.4j y. JZ0e-z4 3z4-3o / 


95 


PR00094


ADENYLATE KINASE SIGNATURE


punnno^r* n o/i i aaa** c\q no 
Jrxuui/y4c iz.y4 i.uuue-Uo iiy- 

136 


96 


PD09397 


PRECURSOR IMMUNOGLO. 


PPinO^TTR 10 04O AOl ^ AO 1 /t"3 

jtjjvjzjz /u iy.<s4 z.uyie-uy 143- 
165 


97 


BL00752 


XPA protein.


"RT nmoD in m onoa no to 


98 


PR00876


SIGNATURE 


JrKUUo/OJt> /.OO Z^Zooe-IU 13D- 

149 


99 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.824e-12 122-
141 


100 


BL00027 


'Homeobox' domain proteins.




101 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 6.870e-12 370-387 
BL00028 16.07 6.885e-11 398-415
BL00028 16.07 8.269e-11 342-359
BL00028 16.07 4.300e-10 229-246
BL00028 16.07 6.100e-10 258-275


102 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 665- 
679 PR00048A 10.52 8.500e-14 
581-595 PR00048A 10.52 9.250e- 
14 637-651 PR00048A 10.52 
2.059e-12 609-623 PR00048A 
10.52 2.588e-12 469-483 
PR00048A 10.52 7.353e-12 553- 
567 PR00048A 10.52 2.895e-11
525-539 PR00048A 10.52 4.316e-
11 441-455 PR00048A 10.52
5.263e-11 413-427 PR00048B
6.02 2.125e-10 569-579
PR00048B 6.02 4.93Se-10 513- 

^93 ppnnft48A m ^ £0£» in 
jzj ri\\j\j\jHor\. iu.jz j.oyoe-iu 

497-511 PR00048B 6.02 8.875e- 

10 429-439 PR00048B 6.02 

1.000e-09 457-467 PR00048B 

6.02 6.684e-09 485-495 


103 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 5.364e-22 31-50
PR00195B 9.47 1.783e-21 56-74 
ppnmo^p 1 1 $(\ 3 A^^*> 91 i9^ 

144 PR00195D 11.76 8.714e-21
175-194 PR00195F 16.20 8.500e- 

90 917-937 PP001QSPQR9 
ZV Z 1 1 -ZJ 1 rrMVLyDSZi 7.0Z 

8.650e-20 194-211 


104


BL01113 


Clq domain proteins. 


BL01113A 17.99 1.865e-09 121- 
148 BL01113A 17.99 5.846e-09 

R9.109 

Oi"lU7 


105 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 6.400e-11 70-99
BL00420A 20.42 8.525e-10 73- 

109 ft T 0049 OA 90 49 ^ 70Rp-OQ 

85-114 


108 


PR00860


VERTEBRATE METALLOTHIONEIN
SIGNATURE 


PT?OOR60T* 7 OA 9 Q9Q*» 90 97 41 
rivuUoQvD /.Un z,yzye-zu z 

PR00860A 5.46 5.500e-16 5-18 

PR00860C 9.61 1.474e-14 41-51


112 


BL01031 


Heat shock hsp20 proteins family profile.


pit 0103 ir 1 7 6R 4onp-io 199- - 
147 ' 


114 


DM01840 


kw SPAC24B11.09 R07E5.13. 


DM01840B 22.04 2.688e-40 59-
103 DM01840A 10.95 9.571e-13

31-43 


115 


BL01126 


Elongation factor Ts proteins.


TIT 011 96A 1 R 4R 9 3 1 7p 3ft 46-RQ 

BL01126B 13.15 7.387e-19 116- 
135 BL01126C 9.20 9.735e-11
190-203 


116 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 4.375e-21 35-85 


118


BL00437 


Catalase proximal heme-ligand proteins. 


BL00437A 18.82 1.000e-40 49- 
l\Jl X5JUUU43/D lo.zo i.uuue-4U 

114-168 BL00437C 21.86 1.000e-
40 190-239 BL00437D 25.72 

1 OOOp-40 94R-301 *RT 0043 7F 

23.95 1.000e-40 327-379 


119 


BL00140 


Ubiquitin carboxyl-terminal hydrolase
family 1 cysteine activ. 


RI 001401"} 79 64 R ?74p-14 164- 
208 BL00140C 11.80 5.444e-10
77-102


120 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 6.712e-10 95- 
148 


122 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 1.000e-40 16-62


123 


PR00041 


CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN
SIGNATURE


PR00041D 7.95 2.906e-09 24-41


124 


PR00041 


CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN 
SIGNATURE 


PR00041D 7.95 2.906e-09 24-41


125 


BL00061


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061C 7.86 3.250e-10 212-
222 


126 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.400e-25 251-290 


127 


PR00318 


ALPHA G-PROTEIN (TRANSDUCIN) 


PR00318D 16.28 1.900e-34 219-

740 PPfi.fl21 QT2 1/1 HQ "2 KC fl on 

Z45 rKUUjioo 14./y J.4_)je-Z/ 

168-191 PR00318C 12.09 7.000e- 

73 107-71^. PPO/HIRA 7 54 

1.600e-19 35-51 PR00318E7.23 

7 Sflflp-17 7fi*.-77^ 


128 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 9.743e-10 67-89 
PR00927B 14.66 4.575e-09 69-91 


130 


BL00824 


Elongation factor 1 beta/beta'/delta chain 
proteins. 


BL00824B 9.21 7.750e-22 133- 
153 


131


BL00824


Elongation factor 1 beta/beta'/delta chain
proteins.


JD.LUU&Z4U 14.35 l.UUUe-40 loo- 

204 BL00824D 14.04 1.621e-38 

7fiA 730 PT nfW7j4P O 71 7 ?<A a 

22 133-153 BL00824E 12.49 
i nnnp.io 747 o£A 


132 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1209-
1228


133 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1168-
1187 


134 


PR00708 


ALPHA-1-ACID GLYCOPROTEIN
SIGNATURE


PR00708D 14.67 1.000e-27 141-
100 rKUU/UoU 1 1./ / l.o4ie-2> 
98-120 PR00708B 15.15 2.174e-

74 7*3 PPn.fi7fi.Bp 13 3*3 

1.600e-21 189-207 PR00708A

1 4 AO 7 A3£p Ol <1 7ft 


135 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE


PR00109B 12.27 8.468e-13 126- 

145


136 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.250e-10 201- 
217 


137 




Small cytokines (intercrine/chemokine)
C-x-C subfamily signat. 


PT fifi.47 1 7*3 Q7 7 AQCia 1 fi /IO OA 


140 






PPfiA7fKP 1 1 3Q ^ ^Q7» IA OOP 

346 PR00205B 11.39 9.01Se-10 
543-561 


141 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 7.704e-09 976- 
1027 


143 


PR00979 


TAFAZZIN SIGNATURE


PR00979E 10.83 5.950e-26 192- 
214 PR00979A 11.91 8.773e-25 

£11 Stl PPfiAO*70/ w, n K /; Af\f\*± 1Q 

oj-oj rvtvuuy/yc iz. jo o.4UUe-iy 
108-124 PR00979D 12.38 7.955e-

TO 1 7fi TC< ■DPrtAOTOTT lfl M 

3.382e-15 230-244 PR00979B
15.59 5.636e-15 94-106 


145 


DM00686 


kw REPLICATION REP 28K 17.7K. 


DM00686C 14.14 7.720e-09 111- 
131 


146


PR00604 


CLASS IA AND IB CYTOCHROME C 
SIGNATURE


PR00604D 15.86 1.000e-17 87-
104 PR00604B 12.73 9.591e-16
57-73 PR00604C 10.21 8.200e-12
73-84 PR00604E 10.13 1.000e-11
106-117 PR00604A 11.13 8.800e-
11 44-52 PR00604F 8.60 1.000e-
10 123-132




BL00107


Protein kinases ATP-binding region
proteins.


BL00107A 18 39 3 864e-l i S266- 
297 BL00107B 13.31 6.143e-11
335-351 


148 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 8.448e-09 67-81 


149 


PR00069 


ALDO-KETO REDUCTASE
SIGNATURE


PR00069D 19.36 1.857e-30 187- 

717 PP 00060 A 16 01 7 470a 7^ 
4 1 -66 PR 00069F 1 R 14^ 1 00p-77 

235-260 PR00069C 16.03 7.000e- 
20 151-169 PR00069B 11.33 
8.071e-19 101-120 




BL00027




RT 00077 76 43 7 6RRp-77 1 30-1 R7 


151 


PD02906 


SYNTHASE I PSEUDOURIDYLATE 
PSEUDOURIDINE LYASE TR. 


PD02906C 24.17 7.070e-22 165- 
200 PD02906B 15.35 8.393e-15 

114^177 PTi07006A 10R4 6^OOa 
09 71-84 


153 


BL00479 


Phorbol esters / diacylglycerol binding 

nomjiiri nmtPinQ 


BL00479A 19.86 5.091e-12 891- 
914 BL00479B 12 57 1 837e-1 1 

S X*T XJ X-i\J\J^ I J X-> lx>iJ f X . UJ / \J X X 

915-931 




BL00027


'Homeobox' domain proteins.


RT 00077 76 43 6 7R6p-3 1 1 43-1 R6 


160 


BL00422 


Granins proteins. 


BL00422C 16.18 7.750e-12 420- 
448 


162 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 9.297e-ll 62-82 


i fj\ 

J o*t 


RT 017R7 


13 Tit? rAnoot *viv\+aitic 


"RT 01 7R7R 30 40 6 1 R7p-1 0 347 

386 


1 66 


PR00860


SIGNATURE 


PP00R60R 7 04 7 07Qp-70 R3-Q7 

PR00860A 5.46 1.000e-18 61-74
PR00860C 9.61 1.900e-15 97-107


167 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.052e-09 196-
218 


169 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 316- 
353 BL00514G 15.98 2.241e-34 
471-501 BL00514H 14.95 6.571e- 
27 510-535 BL00514E 14.28 
1.273e-16 388-405 BL00514D
15.35 9.100e-15 369-382
BL00514B 16.42 4.857e-14 260-
276 BL00514F 11.65 9.690e-14
416-431 BL00514A 11.68 8.200e-
11 149-159


170 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 268-
305 BL00514G 15.98 2.241e-34
423-453 BL00514H 14.95 6.571e-
27 462-487 BL00514E 14.28
1.273e-16 340-357 BL00514D
15.35 9.100e-15 321-334
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14
368-383 BL00514A 11.68 8.200e-
11 101-111


171 


BL00514 


Fibrinogen beta and gamma chains C-
terminal domain proteins.


BL00514G 15.98 2.241e-34 385-
415 BL00514H 14.95 6.571e-27
424-449 BL00514C 17.41 4.632e-
24 230-267 BL00514E 14.28
1.273e-16 302-319 BL00514D
15.35 9.100e-15 283-296
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14
330-345 BL00514A 11.68 8.200e-
11 101-111


173 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.400e-29 119-162


174 


DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL EC. 


DM01 Q70R R 60 S 1 1 Op-1 ^ 1 *501 
1404 


176 


BL00773 


Chitinases family 19 proteins.


RT 0077^p 0 4? r nnop-no 7 i a 


182 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.163e-14 141- 
160 


183 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-


PD01937A 6.68 3.475e-09 221-


185 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.946e-23 247-272 

RT 00R4S 1 (\ 1 69 R*» 0 1 HY7 1 17 


186 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 525-


187 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 497-
j u 


188 


DM01803 


1 HERPESVIRUS GLYCOPROTEIN H. 


DM01803A 10.51 1.000e-09
1081-1102 


189 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 5.091e-15 69-82 


190 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194C 6.38 1.900e-35 145- 
1/4 rK00iy41i 5.74 3.250e-30 
231-257 PR00194D 9.57 1.500e-
26 175-199 PR00194B 10.24 

^ Ofiflf* 7.4 1 OA 1 A 1 DPAAl CM A 

7.86 4.857e-21 84-102






"TRrYM-QjT tt FT TO "FT FPTT? OXT 

TRANSPORT AROMATIC 
HYDROCARB 


rUUZU4Z±5 lo./D D.l-)4e-09 131- 
146 PD02042A 21.13 5.909e-09 

Q4-191 


193 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


195 


BL00463 


Fungal Zn(2)-Cys(6) binuclear cluster
domain proteins.


BL00463 8.22 5.071e-09 111-123 


196 


PR00118 


BETA-LACTAMASE CLASS A 
SIGNATURE 


PR00118F 16.42 9.386e-09 165- 

1 Rl 


197 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 5.424e-09 234- 

767 
ZO / 


198


BL00660 


Band 4.1 family domain proteins. 


BL00660A 31.50 5.500e-ll 714- 
767 


199 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.820e-13 70-93 


202 


PR00009


TYPE I EGF SIGNATURE


rXOUDUVA J 4. J 5 5~?45e-15 971- 
0C7 T>Ti f\(\(\(\Qr* 1/1110 Try a. \1 

996-1008 PR00009D 16.83 ■ 

O.UVUC-ll IVJUo-lUlo JTXvUUUUlrL^ 

14.11 1.882e-09 892-904 


203 


BL00025 


P-type 'Trefoil' domain proteins.




205 


BL00018 


EF-hand calcium-binding domain 
proteins.


BL00018 7.41 7.300e-10 165-178 


206 


PR00168 


SLOW VOLTAGE-GATED 
POTASSIUM CHANNEL SIGNATURE 


PR00168D 12.88 6.865e-ll 67-86 


207 


BL00025


P-type 'Trefoil' domain proteins.


JdI^UUUzj 1 /.l / 3.4zie-20 Jy-6U 
BL00025 17.17 8.750e-16 88-109 


209 


BL00646 


Ribosomal protein S13 proteins. 


BL00646B 21.42 6.100e-30 110- 
143 BL00646A 25.82 6.192e-29
14-62 


210 


PR00138 


MATRIXIN SIGNATURE


PR00138D 16.56 3.605e-25 279- 
iU5 rKUUlioU 10.41 j.UUUe-24
218-247 PR00138E 6.01 8.714e- 
13 314-328 PR00138A 15.14 
9.538e-13 134-148 PR00138B 

1 ^ Q9 A <99*» 19 1 C8 O/Vi 


211 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 8.429e-12 386- 
406 DM01206B 10.69 1.247e-10 
384-404 DM01206B 10.69 
5.068e-10 388-408 


212 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP. 


PD01941A 14.81 1.000e-40 163- 
217 PD01941B 15.02 9.705e-30 
420-467 PD01941E 15.92 8.714e- 

T2 C37 GSM X>T\C\1QAM^ ID Q/C 
0 9AAo 90 ^flfi ^£1 PT^niO/tlTi 

27.18 1.600e-16 661-710 
PD01941F 28.52 9.645e-15 1005- 

iuOU 


213 


BL00362 


Ribosomal protein S15 proteins. 


BL00362 24.67 8.313e-09 330-373 


214 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins.


BL001 152 3.12 2.125e-09 1 178- 

1 1ll T5T AA 1K71 T) £ f\Q£a AQ 

12.11 ulajkjv ioij j.iz o.uyoe-uy 
1164-1213 


215 


BL00038 


Myc-type, 'helix-loop-helix' dimerization 
domain proteins. 


BL00038B 16.97 7.600e-18 125- 

140 Jd1aKJU.5o/V Ij.oI 1.4/4e-IJ 

102-118 


916 


RT Al 1 C\$l 


Ribosomal protein L24 proteins.


diahiuoa zu.3j z.Z4ie-zz 4y-oz 
BL01108B 11.40 8.457e-10 96- 
107 


217 


PR00381 


KINESIN LIGHT CHAIN SIGNATURE 


PR00381A 9.55 1.321e-10 360- 
378 


222 


BL00514


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 2.358e-26 1166- 
1203 BL00514G 15.98 9.000e-15 
lzey-1319 JdLUU514D 15.35 
6.936e-12 1207-1220 BL00514F 
11.65 4.288e-10 1253-1268

T37 /VK 1ATJ TA ft A'ZZa 7 A 1 1 1 C 

Jt>i^uuji4ri 14. yj cojoe-iu idle- 
1343 


223


r>T nn^9^ 

■DlAfUJZJ 


Actin-depolymerizing proteins.


pt aai7^ti oi l n(\f\a ac\ qi 
oiajujZjJd zi.oo i.uuue-4U yj- 

T3Q RT ftA19^A 94 ftl Q lllf> 9A 

61-93 


224 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 1.450e-10 231-244


225


ptrni ^9Q 


Pterin 4 alpha carbinolamine dehydratase.


DT7A1 19015 15 W 1 /TOa 10 £n 00 

rruijzyo 15. jj, i.oyze-i 0 0 /-yz 


228 


BL00211 


ABC transporters family proteins. 


BL00211B 13.37 6.250e-18 1033- 

1 f\&< TU HH9 1 1 P T5 17 0 57^ 0 1 Q 
1UOD Ol^UUZllD iJ.jf O.O/De-lo 

2045-2077 BL00211A 12.23 


230 


PR00761 


BINDIN PRECURSOR SIGNATURE


PR00761A 5.81 9.366e-09 275- 
292 




"PRftYtfldQ 

rr\.UUVrr7 


SIGNATURE 


punnrt/ion n ah 1 <r»Ao in <o 

rK\J\)\)HyU U.UU j.DUUe-lU D4-oy 


TV) 




Neuromodulin (GAP-43) proteins.


j3jluu4izi^ 10. ->4 i.y/oe-iu iuy- 
160 BL00412D 16.54 4.122e-09 

1 70.1 QA 


233 


BL01210


Caveolins proteins.


BL01210B 13.92 8.129e-09 106- 
156 


236 


BL00939 


Ribosomal protein L1e proteins.


BL00939F 17.27 5.393e-09 861- 
891 


238 


BL01252 


Endogenous opioids neuropeptides 
precursors proteins. 


BL01252D 18.25 3.571e-28 205- 
233 BL01252B 19.09 5.034e-27 


j f-o/ duu izdzks io.iv x.ozie-zi 
164-190 BL01252A 14.22 7.107e- 
18 14-34 




BL00302 


Eukaryotic initiation factor 5A hypusine

proteins. 


RT flAlftO 14 CI 1 A .ft.0,e»_/in T< 70. 
.DljUUJlrZ lH.51 1 .UUUc-*fU 4J-fy 


240


PR00420


AROMATIC-RING HYDROXYLASE

(FLAVOPROTEIN
MONOOXYGENASE) SIGNATURE


PPOA47ftA 14 75 °.B^1o T2 0£ vIO 


241 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR L 


PD02929A 28.27 4.529e-09 235- 
289 


243 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 8.527e-25 11-50 


244


BL01270


Band 7 protein family proteins.


.DJLIH2/UC lo.91 0.745e-17 115- 
144 BL01270B 18.74 6.857e-17 
76-115 BL01270E 13.03 6.016e- 
15 182-211 BL01270D 20.87 
9.160e-13 144-182 


245




Domain present in ZO-1 and Unc5-like 
netrin receptors. 


FrUO/ylB 28.49 o.305e-12 253- 
308 PF00791B 28.49 1.909e-11
427-482 PF00791B 28.49 2.651e-
09 179-234 PF00791B 28.49 
3.890e-09 112-167 


246




PROTEIN ZINC-FINGER METAL-

BINDI. 


PD00066 13.92 2.500e-13 277-290
PD00066 13.92 9.143e-12 193-206
PD00066 13.92 5.304e-11 165-178
PD00066 13.92 6.478e-11 249-262
PD00066 13.92 3.423e-10 221-234 


247




Actins proteins. 


nr r\/\ a n/CT\ to CO £ /inn* on A£.c 

J3L0U406D 12.58 o.4U0e-20 465- 
520 BL00406B 5.47 4.857e-14 
249-304 BL00406E 8.44 1.000e-
11 522-572 BL00406C 6.75
5.449e-11 313-368


248




ER lumen protein retaining receptor 
proteins. 


BL00951C 19.35 1.000e-40 112-
161 BL00951A 15.10 7.750e-39 
21-57 BL00951D 13.94 6.000e-38 
161-196 BL00951B 14.23 3.100e-
31 57-88


252 



BL01113 


C1q domain proteins.



BL01113A 17.99 9.129e-15 200-
227 BL01113A 17.99 4.818e-14
194-221 BL01113A 17.99 7.818e-
14 182-209 BL01113A 17.99
1.730e-13 185-212 BL01113A
17.99 6.595e-13 191-218
BL01113A 17.99 6.077e-12 203-
230 BL01113A 17.99 9.182e-11
179-206 BL01113A 17.99 2.532e- 
10 170-203 BL01113A 17.99 
y.U4Je-lU 215-24.) rSLUllliA 
17.99 9.426e-10 209-236 

RT ftl 1 1 ^ A 17 Q0 4 1 Ko no m 

164 


257 


BL00845


CAP-Gly domain proteins. 


BL00845 16.43 1.837e-21 466-491


259


PR00248


METABOTROPIC GLUTAMATE

GPCR SIGNATURE 


riCUU24oU 12.0 / 2.0ooe-09 33-7o 


260 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 3.400e-10 441-452
BL00678 9.67 5.800e-10 481-492
BL00678 9.67 8.800e-10 358-369 


261 


BL00678


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 415-426 
BL00678 9.67 5.800e-10 455-466








BL00678 9.67 8.800e-10 332-343 


262






BL00678 9.67 5.800e-10 508-519


263


BL50002 


Src homology 3 (SH3) domain proteins 


BL50002B 15.18 2.200e-10 415- 

490 


264 


BL00049 


Ribosomal protein L14 proteins.


BL00049C 17.38 3.040e-12 94- 


265 


PD01469


GLYCOPROTEIN PROTEIN
PRECURSOR SA. 


p~nn i am in *\Q i noi p.u/io 7n 

JTJLp/w 1*t\J7 aU.J7 Z.y7lC"I i f i fj(j"n/V 


266 


PD01469 


GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 


PD01469 20.59 2.091e-14 279-311


267 


BL00567 


Phosphoribulokinase proteins. 


BL00567A 10.66 1.161e-12 36-55 


9£Q ' 




Ribosomal protein L14 proteins.


Jt>JL0vJU4yu 1 /.Jo 2.0ooe--£6 92- 
- loc tit aaaaqr i c /io £ oa/:- o/i 

54-86 BL00049A 13.86 8.333e-19 
129-140 


272 


BL01115 


GTP-binding nuclear protein ran proteins.


BL01115A 10.22 9.735e-12 14-58 


273 


PR00021 


SMALL PROLINE-RICH PROTEIN 


PR00021A 4.31 1.911e-09 819-

Q11 


275 


PR00179 


LIPOCALIN SIGNATURE 


PR00179B 9.56 2.895e-13 124- 
137 PR00179A 13.78 3.250e-11
36-49 PR00179C 19.02 6.040e-11
154-170 


. zip 




SIGNATURE 


PR00449A 13.20 8.364e-17 22-44
PR00449C 17.27 1.000e-13 62-85
PR00449E 13.50 4.000e-12 172-
195 PR00449B 14.34 5.680e-10 
45-62 


m 

jLi 1 


"DT ftAl /in 

BL00140


Ubiquitin carboxyl-terminal hydrolase
family 1 cysteine activ. 


£>1AHJ14UIJ 22.04 l.UUUe-4U 161- 
205 BL00140C 11.80 9.053e-30

951 ^ 7^ RT nniidriR 19 A &AQa 

Zo j-jj Dl>UUI4Ur> iz.zr4.o4ye- 
17 37-55 


278




FT FMFKTT TP A"W^POQ A POP 
TP AN^Pn^ON TP AN^PO^ART F 


PT^fl9'7 1 9 A 9 *3 f\3 C H1 1 a AO A 1 CJ 
rU\JZ f LZJ\ Zj.KJj O.UijC*V7 4/-oj 


279 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 1.474e-09 100-111 


282 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.767e-21 864- 


283 


BL00048 


Protamine P1 proteins.


BL00048 6.39 9.550e-09 56-83 




PR00081


GLUCOSE/RIBITOL

DEHYDROGENASE FAMILY
SIGNATURE


PT?nnfi91A Ifl^Q 1 QlQa 11 1£. </l 

rJt\.uuu<5lA iU.jJ l.o/oe-Jil joon 


287 


PR00310 


ANTIPROLIFERATIVE PROTEIN 

BTG1 FAMILY SIGNATURE


PR00310B 10.59 4.231e-17 29-59
pp firm rrno irt/\A9QA t^co iio 


289 


PD01066


PROTEIN ZINC FINGER ZINC- 

FINGER METAL-BINDING NU.


PD01066 19.43 7.000e-36 37-76 


293 


BL00979 


G-protein coupled receptors family 3 
proteins.


BL00979L 20.63 3.800e-12 111- 
152 


9Q^ 


PT*Jfl9 A 1 1 
rlJyZH 1 1 


PPPkTFTKT TT? AXTCr^P rPTT/"YM 

REGULATION NUCLEAR. 


Tyr^iVM 11 ti 00 i nAAii i c 1 nc 00 fi 

ruuz4ii zi.sy /.uuue-io lyo-zzy 


296 


BL01064 


Pyridoxamine 5'-phosphate oxidase
proteins.


BL01064A 27.84 8.313e-28 77- 
129 BL01064C 15.22 7.136e-25 

90.9-93 ^ 


297 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 2.929e-13 37-56 
BL00030B 7.03 1.900e-11 167-
177 BL00030A 14.39 2.000e-10 
128-147 


298 


BL01183 


ubiE/COQ5 methyltransferase family
proteins. 


BL01183B 21.31 6.660e-12 143- 
188 


299 


BL01279 


Protein-L-isoaspartate(D-aspartate) O-
methyltransferase signa. 


BT 0127QA 94 97 <; R69p-1 1 ^7- 
1 05 


301 


BL00191 


Cytochrome b5 family, heme-binding
domain proteins. 


BL00191K 17.38 4.951e-27 184-
228 BT 00191 T 1 \ XI &AA1(>-\1 
128-150 


302 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 3.893e-16 33-67 


306 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 2.988e-09 416- 
451 


307 


PR00245 


OLFACTORY RECEPTOR 

SIGNATURE


PR00245A 18.03 4.818e-21 59-81 
ppftft94<r* 7 c/i < tc^*. onoio 

9 S4 PP ftft94^n 1 ft 47 4 ftftftp 1 ^ 

274-286 PR00245B 10.38 8.200e- 
15 177-192 PR00245E 12.40 
5.714e-12 291-306


309 


BL00203 


Vertebrate metallothioneins proteins.


RT ftft9ft^ 1 3 Q4 9 94Sp 1 ft 0 


310


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 7.632e-23 119- 
159 BL00237C 13.19 3.864e-15 
251-278 BL00237D 11.23 3.739e- 
12 312-329 


311 


BL00380


Rhodanese proteins.


BL00380D 15.90 8.200e-28 110- 
136 BL00380G 11.26 5.800e-16 
267-280 BL00380B 14.77 7.000e-

13 203-214 BL00380C 15.67 
7.387e-13 82-98 BL00380E 12.44 
7.000e-11 181-193 BL00380A
10.48 1.000e-09 10-20


312 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins. 


BL00227B 19.29 1.000e-40 50- 
105 BL00227C 25.48 1.000e-40

111-163 BL00227D 18.46 1.000e-

40 220-274 BL00227F 21.16 
1.000e-40 372-426 BL00227A
24.55 3.250e-39 1-35 BL00227E 
24.15 8.500e-34 324-359 


327 


BL00232 


Cadherins extracellular repeat proteins
domain proteins.


BL00232B 32.79 7.362e-21 225- 
273 BL00232B 32.79 2.588e-17 
435-483 BL00232B 32.79 6.301e- 
15 116-164 BL00232B 32.79 
o. /oye-ij> jjv-j /o JtJJL/UUzjjiU 
10.65 9.341e-12 223-241 
BL00232C 10.65 5.696e-11 328-
346 BL00232C 10.65 3.942e-10 


329 


PD02749 


TRANSCRIPTION PROTEIN FACTOR 
BTF3 REGULATION NUCL.* 


PD02749B 12.75 2.241e-37 35-71 
PD02749C 13.96 4.892e-28 87- 
121 PD02749A 9.56 6.000e-15 2-
15 


330


PR00391


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN SIGNATURE 


ppfiftQOiP n cmiQCfl koi i 
rKUUoyiJc, 1Z.DU /. /o^e-l j Zl 1- 

231 PR00391B 8.39 1.000e-13 

ft.4 PPftft^OI n 19 91 0 "iOfip 

13 191-207 PR00391A7.83 

S ^00p-1 1 1 

J.J7vC*J 1 1D"jO 


332 


BL01030


RNA polymerases M / 15 Kd subunits 
proteins. 


BL01030 23.44 1.818e-23 87-125 


337 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.929e-32 6-45 


340 


PD02711 


SYNTHASE 


PD02711B 14.26 1.973e-20 944- 






PHOSPHORIBOSYLFORMYLGLY.


968 


343 


BL00223 


Annexins repeat proteins domain 
proteins. 


BL00223C 24.79 1.000e-40 245-
300 BL00223B 28.47 8.714e-38
168-218 BL00223A 15.59 8.250e-
27 98-132 BL00223A 15.59
8.750e-27 26-60 BL00223C 24.79
9.438e-16 13-68 BL00223C 24.79
2.735e-15 85-140 BL00223A
15.59 2.253e-11 258-292


346 


PR00345 


STATHMIN FAMILY SIGNATURE 


PR00345B 7.12 2.800e-28 81-110
PR00345F R S4 7 *Wp-9R I^R. 
183 PR00345C 4.54 9.100e-28 
110-134 PR00345D 10.97 1.964e-
24 134-158 PR00345A 13.46 
5.645e-16 52-71


347 


BL00586 


Ribosomal protein L16 proteins.


BL00586B 17.00 3.215e-15 184- 
221 


348 


PR00388 


3',5'-CYCLIC NUCLEOTIDE CLASS II
PHOSPHODIESTERASE SIGNATURE 


PR00388A 10.45 2.778e-09 86-
105 


351 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.118e-11 160-173
BL00018 7.41 2.350e-10 244-257


354 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 1.947e-09 256-267


358 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 3.278e-09 175- 
195 DM01206B 10.69 6.696e-09 

183-203 DM01206B 10.69

8.633e-09 132-152 DM01206B 
10.69 8.861e-09 181-201
DM01206B 10.69 9.316e-09 177- 
197 


361 


PD01498


OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP. 


PD01498C 24.90 6.880e-14 219- 
263 


362 


PD01498 


OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP. 


PD01498C 24.90 6.880e-14 219-
263 


365 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 1.000e-11 589-
600 BL00178A 14.23 8.500e-09 
46-56 


366 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 1.000e-23 318- 

34R *RT 0ftS9^ A 1 3 ^ ^AAa 1 £ 

30-47 BL00523B 8.64 1.964e-13

129-140 BL00523G 9.46 5.500e-
10 506-516 


369 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.818e-09 21-52


370 


BL00880 


Acyl-CoA-binding protein. 


BL00880 17.52 1.000e-40 75-125 


371 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.000e-23 276- 
307 BL00107B 13.31 1.692e-12 


372 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 6.602e-11 326-
347 PR00211B 0.86 6.106e-10
320-341 PR00211B 0.86 3.167e-
09 333-354 


373 


BL00279 


Membrane attack complex components /
perforin proteins. 


BL00279E 37.11 9.349e-10 749-
797 


375 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.231e-33 10-49


377


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.563e-28 10-49 


379 


BL00598 


Chromo domain proteins. 


BL00598 14.45 5.781e-16 3-25 


380 


PR00413


HALOACID

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413D 11.28 8.941e-09 864- 
878 


383 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413D 11.28 8.941e-09 864-
878 


387 


BL01060 


Flagella transport protein fliP family 
proteins. 


BL01060A 15.65 1.535e-09 131- 
174 


388 


PR00209 


ALPHA/BETA GLIADIN FAMILY
SIGNATURE 


PR00209B 4.88 6.318e-11 1009-
1028 


389 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837B 11.64 1.000e-10 469-
483 


391 


BL00240


Receptor tyrosine kinase class III 
proteins. 


BL00240B 24.70 7.907e-10 118-
142 


392 


PR00014 


FIBRONECTIN TYPE III REPEAT 
SIGNATURE 


PR00014D 12.04 8.412e-10 691- 
706 


393 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014D 12.04 8.412e-10 706-
721 


394


BL01209 


LDL-receptor class A (LDLRA) domain 
proteins. 


BL01209 9.31 3.368e-15 47-60 
BL01209 9.31 5.500e-13 92-105 


395 


BL00634 


Ribosomal protein L30 proteins. 


BL00634 34.38 4.090e-13 70-121 


396 


BL01013


Oxysterol-binding protein family

proteins. 


DLU1U1 ZO.ol o.UUUe-ZO jjO" 

402 BL01013A 25.14 7.231e-21 
45-81 BL01013C 9.97 1.000e-13
132-142 BL01013B 11.33 1.000e-

11 11 \J 1X1 


397


BL00930 


Peripherin / rom-1 proteins. 


BL00930E 17.80 1.000e-40 56-92 

RT OOO^On O 19 A A39p 37 19 

BL00930F 16.91 2.800e-36 92- 
133 


400 


PR00780 


LEUSERPIN 2 SIGNATURE 


PR00780B 4.89 4.491e-09 262- 
285 


401 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e-11 4-20


403 


BL00381


Endopeptidase Clp serine proteins. 


BL00381C 23.84 1.250e-32 150-
194 BL00381A 16.48 2.286e-22 

7d-1 11 "RT OOTR1R 91 AO ft T9fi*> 
/*+- ill oLAJyJ jo ID J. 1 ML o.jZDc- 

14 78-130 


405 


BL01105 


Ribosomal protein L35Ae proteins. 


RT 011 OS A 17^7 1 0O0f>-AO4~dQ 

BL01105B 12.95 1.000e-40 68- 
108 


406 


BL00344 


GATA-type zinc finger domain proteins. 


BL00344 17.99 7.000e-12 814-852 


407 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 9.750e-09 73-94


409 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 4.321e-09 9-22 


410 


BL00762 


WHEP-TRS domain proteins.


"RT 007A9A 01 A1 1 (\f\Cia OQ n^O 

789 BL00762A 23.43 4.400e-21 
903-940 BL00762A 23.43 5.415e- 

18 825-862 RT 007tf9R 1£ 14 
8.759e-12 1154-1168 


412


BL00690 


DEAH-box subfamily ATP-dependent 
helicases proteins. 


BL00690B 13.38 5.320e-15 262- 
280 BL00690A 6.87 1.818e-13 
230-240 


415 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins.


BL00227B 19.29 1.000e-40 52-
107 BL00227C 25.48 1.000e-40
113-165 BL00227D 18.46 1.000e-
40 222-276 BL00227F 21.16 
1.000e-40 382-436 BL00227E 
24.15 1.750e-34 326-361 










416 


PF00992 


Troponin. 


PF00992A 16.67 1.711e-09 557-
592 


A 1 C 




Nuclear transition protein 1 proteins. 






BL00541


Nuclear transition protein 1 proteins. 




420 


PF00856 


SET domain proteins. 


PF00856A 26.14 9.074e-13 901- 
ioo JrrUUoDOD 10.4zji.jy/e-I/ 
951-973 


421


BL00678


Trp-Asp (WD) repeat proteins proteins. 


"est nn.<70 o &n c onfu ion /m 


423 


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PT^A.1 ft££ 1Q /17 C ^AAo OA T2A 1 £G 

ruuiuoo iy.4j o.ouue-3U 130109 


424 


PF00564 


Octicosapeptide repeat proteins. 


PF00564B 24.74 1.305e-17 421-
472 


426


PR00988


URIDINE KINASE SIGNATURE


PR00988A 6.39 4.569e-12 3-21


427 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-12 3-21 


428


BL00478 


LIM domain proteins. 


TIT AAylTOTJ 1 A TA O OCA™ 1 O 1 1 C 

BL00478B 14.79 3.250e-13 115-
130 BL00478B 14.79 9.036e-13 
50-65 


431 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.875e-12 464-487 


432


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 7.800e-18 316- 

■3 CT TJFVAAAOA A iTI C% £1 "7 _ i o 

357 PD00930A 25.62 9.617e-12
125-151 PD00930B 33.72 2.521e- 


433 


PD01066 


PROTEIN ZINC FINGER ZINC- 


PD01066 19.43 4.649e-34 34-73 


434 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.563e-11 56-78


436 


PR00120 


H+-TRANSPORTING ATPASE 
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e-19 705- 
722 


437 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins.


BL00115T 8.45 7.273e-29 1208- 
1242 BL00115Q 18.08 2.776e-21 
953-983 BL00115Y 11.86 8.000e- 
17 1604-1650 BL00115M 19.19 
8.130e-16 731-774 BL00115H 
14.34 9.392e-16 463-496
BL00115A 15.44 7.414e-15 43-82
BL00115R 6.50 6.128e-14 983-
1010 BL00115J 16.71 9.289e-14
591-617 BL00115I 8.33 4.336e-
13 535-590 BL00115L 12.25
5.939e-13 662-694 BL00115G
11.65 6.011e-13 435-463
BL00115K 15.03 3.417e-10 617-
659 BL00115O 16.76 5.805e-10

Ol 3 DT AA.1 1 <P 1 1 </! 7 <OOa 

oOJ-yiJ dLUUI lOr 1 1.04 /..)3oe- 

10 913-953 BL00115S 18.24 

/.yOoc-lU 1U1U-1U3Z CLAJUllJU 

10.34 4.475e-09 1242-1265 


438 


PF00628 


PHD-finger. 


PF00628 15.84 4.536e-10 219-234


440 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 6.351e-34 10-49


441 


PR00309 


ARRESTIN SIGNATURE 


PR00309A 9.68 5.250e-24 32-55 
PR00309D 7.09 4.938e-23 290- 

69-88 PR00309C 8.22 1.621e-19
165-183 PR00309E 9.82 9.438e-
15 374-389 


442 


BL00600 


Aminotransferases class-III pyridoxal-


BL00600B 19.60 7.324e-14 103- 






phosphate attachment si. 


129 BL00600G 12 43 2 12<5e-12 
306-325 BL00600F 8.77 8.105e- 
12 271-284 BL00600E 16.43 
3.167e-11 228-257 BL00600D
8.71 8.650e-09 207-221 


443 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 3.160e-18 69-87


444 


BL00349


CTF/NF-I proteins. 


BL00349A 10.07 1.000e-40 8-54 
BL00349C 9.33 1.000e-40 82-125
BL00349E 10.79 1.000e-40 152-
195 BL00349F 11.81 1.000e-40
213-255 BL00349H 15.70 7.387e-
36 361-399 BL00349B 10.51
2.227e-34 54-82 BL00349D 11.70
9.100e-34 125-152 BL00349G
19.72 5.781e-30 323-356


445 


BL00154 


E1-E2 ATPases phosphorylation site 
proteins. 


BL00154F 8.23 8.941e-21 271-
295 BL00154E 20.37 2.620e-15
124-165 


448


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 4.882e-11 82-115
DM00215 19.43 6.492e-09 87-120 


451 


BL01283 


T-box domain proteins.


BT fll 1 JS3 A 94 1 5 3 1 flfip-40 1 1 ?- 

160 BL01283D 11.70 6.000e-39
253-286 BL01283B 23.17 6.538e-
38 170-212 BL01283C 13.05
7.750e-19 222-236 


452 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE


PR00420A 14.78 2.579e-11 3-26


453 


PR00162 


RIESKE 2FE-2S SUBUNIT 
SIGNATURE 


PR00162B 12.77 7.429e-17 215- 
228 PR00162A 9 3^ 2 324e-14 
193-205 PR00162C 8.10 7.120e- 
14 227-240


454 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-30 87-126 


456 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.333e-18 1149- 
1192 


457 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01 066 1 9 43 2 737e-94 1 fi-SS 


459 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290A 20.89 1.529e-14 154- 
177 BL00290B 13.17 9.000e-12

ill XJJ^WAi7UJJ U.l / 7.UUUC I/i 

214-232


460 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE 


PR00413F 14.91 7.333e-11 193-

XXW/v'Ti.JX 1ti7J / ,JJ 1 1 XJJ 

214 PR00413E 15.78 5.714e-09
175-192 


463 


PR00759 


BASIC PROTEASE (KUNITZ-TYPE) 
INHIBITOR FAMILY SIGNATURE 


PR00759B 11.26 8.385e-09 74-85


466 


BL00019 


Actinin-type actin-binding domain
proteins.


BL00019D 15.33 4.200e-19 300- 
330 


467 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 4.200e-19 300- 
330 


469 


PR00153 


CYCLOPHILIN PEPTIDYL-PROLYL
CIS-TRANS ISOMERASE 
SIGNATURE 


PR001 53D 11 99 % 2S0e-l SSIQ. 
523 PR00153C 11.01 4.682e-14 
495-511 PR00153E 9.10 8.548e-
14 523-539 PR00153B 11.57
1.720e-13 452-465


470 


BL00491 


Aminopeptidase P and proline 
dipeptidase proteins. 


BL00491C 12.15 3.912e-09 557- 
572 


471 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 


PD00289 9.97 1.000e-14 1482-
PRESYNA.


1496 PD00289 9.97 8.650e-11
1122-1136 


474 


BL50040 


Elongation factor 1 gamma chain profile. 


BL50040D 17.41 1.000e-40 279- 
329 BL50040E 18.79 1.000e-40 

RT 50040F 15! 00 ^ 19/W 
40 3Q0-428 RT 50040C 99 69 
3.739e-38 141-184 BL50040B 
13.65 7.000e-30 59-85 BL50040A 
12.98 1.450e-14 10-22 


475 


BL01144 


Ribosomal protein L31e proteins.


BL01144 25.07 1.000e-40 22-74


476 


PR00007 


COMPLEMENT C1Q DOMAIN
SIGNATURE 


PR00007P 1 5 60 9 49 1 p-9 1 ^RQ 

611 PR00007B 14.16 3.500e-21 

544-5/*d PR00007A 10 11 8Q7#» 

20 517-544 PR00007D 9.64 
6.571e-12 623-634


477 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 5.846e-10 170- 
189 


479 


DM01970 


0kwZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 9.500e-17 967- 
980 


480 


PR00868 


DNA-POLYMERASE FAMILY A (POL 


PR00868C 13.76 5.688e-17 284- 

794-947 PR00RART4 19 ^1 7 7RR*> 

13 431-448 PR008681 10.87 
7 93 Re-11 469-476 PR00R6RF 

13.19 1.608e-10 340-366 


481 


BL00027 




RT 00097 96 43 0 1 R9p-99 


482 


BL00061 


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061B 25.79 3.647e-21 188- 
226 


483 


BL50002


Src homology 3 (SH3) domain proteins

profile. 


RT 50009 A 14 10 1 750** 19 1079 

1051 


485 


PF00023 


Ank repeat proteins.


PF00023A 16.03 9.625e-10 760-

776 ' PF00091 A 16 03 1 57 Ip 00 

715-731 


486 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 9.262e-20 103- 
136 PD02870D 15.74 9.426e-09 
201-236 


487 


PR00370 


FLAVIN-CONTAINING 
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370G 10.45 3.769e-28 471- 
493 PR00370B 10.91 1.000e-24 
27-46 PR00370C 12.72 4.000e-21 

1 Af\ 1 DD nfl79fiT? I 1 Q< O OOOo 

21 320-339 PR00370D 16.33 

1 750p-90 1R5-904 PR00770F 
17 75 7 395e-90 375-^Q5 
PR00370A 3.35 2.038e-18 4-20


489 


PD01675 


GLYCOPROTEIN MAJOR ENVELOPE 
PROBABLE U3. . 


PD01675C 19.89 2.330e-10 55-89


492 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 45-57


493 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 45-57 


494 


BL00211 


ABC transporters family proteins.


BL00211A 12.23 5.050e-09 58-70


495 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 6.786e-12 509-552 
BL00027 26.43 9.143e-12 319-362
BL00027 26.43 2.600e-11 627-670
BL00027 26.43 3.625e-10 779-822 


497 


BL00107


proteins. 


'RT 001 07 A 1 R ^ COrio 99 9 14 

245 BL00107B 13.31 1.000e-13
281-297 BL00107A 18.39 3.520e- 
13 583-614 BL00107B 13.31 
8.615e-12 652-668 


499 


BL00383 


Tyrosine specific protein phosphatases 


BL00383E 10.35 1.000e-14 1902- 
proteins.


1913 BL00383D 11.92 3.077e-14
1862-1875 BL00383A 13.34 - 

10.1.0 2.000e-13 1785-1796 
BL00383F 15.51 9.069e-12 1940- 
1956 BL00383B 7.61 1.692e-11

1 /JJ-I /04 


501 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-09 136- 
150 PR00019A 11.19 1.667e-09 
91-105 PR00019B 11.36 4.600e-
09 160-174 


503 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 367- 
414 BL00226B 23.86 6.143e-27
195-243 BL00226A 12.77 7.840e- 
14 96-111 BL00226C 13.23 
2.600e-13 309-340 BL00226C 
13.23 0.143e- 12 266-297 
BL00226B 23.86 1.209e-09 146- 
194 


505




'i-RTCPun^DunnT vpcd atc 
•D-DlornUorriVJOI^ Y L-JbKA 1 fc- 

rwnFPP>JTYF>JT PwnQPi-rnnT vptjp. 

•iiNJL/&rC»INJ-/IllN 1 rnUorflUUL I L*E,i\.. 


rD024U7r 7.61 o.739e-Q9 916- 
yju 


506 




HECT-domain (ubiquitin-transferase).


rruuo32u 2u.oo y.53ue-iy yyi- 

ifY53 1 8 4^ i i^^o ii 
luzj rri/uOjiD i.ijje-ii 

940-968 


507 


BL01082


Ribosomal protein L7Ae proteins. 


BL01082 20.37 4.273e-20 76-116


508 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 493-504 


509 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 2.421e-09 473-484 


510 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 4.774e-11 567-
582 PR00320B 12.19 5.886e-10 
763-778 PR00320C 13.01 6.760e- 
10 567-582 PR00320A 16.74 
7.618e-10 846-861 PR00320A 

1 H HA 1 A 1 Ca AO Tro T7P 

10.74 3.41 5e-09 763-778 
rssxjvjzvA io. /4 o.2ooe-uy 30 /- 

SR? 


511 


BL00479 


Phorbol esters / diacylglycerol binding 

domain proteins.


BL00479C 12.01 3.250e-12 170- 


512 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 7.494e-09 10-58 


513 




Somatomedin B domain proteins.


r>i-uuD24A y.OD o.y25e-l4 aU-101 


515 


BL00041 


Bacterial regulatory proteins, araC family 


BL00041 23.99 1.964e-19 492-524


516 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.500e-13 391-404 


517 


BL00415 


Synapsins proteins. 


BL00415E 4.82 9.291e-09 959- 

QQ£ 


518 


PR00109


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.471e-12 126- 
145 


519 


BL00290


Immunoglobulins and major
histocompatibility complex proteins. 


"DT nnoonu io it a tca„ An a n £c 

BL00290B 13.17 4.750e-09 47-65


522 




DNA METHYLTRANSFERASE 


DU nn^n^ a i a k t no« a c\ nCA 
rivULOUjA 14.13 7.12oe-Uy 364- 

381


525 


BL00312 


Glycophorin A proteins. 


BL00312B 9.22 5.781e-10 891- 
920 


528 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.500e-32 16-55


529 


PR00254 


NICOTINIC ACETYLCHOLINE 
RECEPTOR SIGNATURE 


PR00254D 15.50 4.000e-17 131- 
150 PR00254A 11.23 4.706e-14
61-78 PR00254C 11.36 4.000e-12 
113-126 PR00254B 12.97 1.486e-
11 95-110


531 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 6.870e-16 787-
810


532 


PR00193 


MYOSIN HEAVY CHAIN


PR00193D 14.36 3.143e-34 447- 
H/o rjvuuiyj^ iz.ou /.oj/e-Jz 
216-244 PR00193B 11.69 7.750e- 

7Q1A7 1 C3 PPfiftlOIA 1 ^ A 1 

2.588e-22 111-131 PR00193E 


533 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR. 


PD02R70R 1 R R1 S SQfip-ftQ 3zLft 

381 


535 


PR00683 


SPECTRIN PLECKSTRIN

HOMOLOGY DOMAIN SIGNATURE 


484 


536 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.684e-24 164-207 


538 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE


PR00239E 1.58 2.739e-09 225- 
237 


539 


BL00406 


Actins proteins. 


BL00406C 6.75 1.000e-40 157-
III rSL0040oJB 5.47 Q.14^e-37 
90-145 BL00406D 12.58 4.600e- 

Its. 001 1AA DT nn^lAAT? 0 >M 
JO Zy 1-340 l3L>UU4UOIl 0.44 

2.200e-33 364-414 BL00406A 

Q OS 4 44.1f»-9"3 7-49 


540 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 9.625e-10 44-59 


541


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE


PR00456E 3.06 9.625e-10 44-59 


542 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.857e-ll 138- 

1 S4 


544 


PF00642 


Zinc finger C-x8-C-x5-C-x3-H type (and
similar). 


PF00642 11.59 9.082e-10 838-849 


546 


BL00383 


Tyrosine specific protein phosphatases 
proteins. 


BL00383E 10.35 4.115e-10 104-
115 


547 


BL01226 


Hydroxymethylglutaryl-coenzyme A 
synthase proteins. 


BL01226A 13.79 1.000e-40 50-89 
BL01226C 13.51 1.000e-40 127-
167 BL01226D 11.60 1.000e-40
174-210 BL01226E 13.74 1.000e-
40 212-253 BL01226H 17.74
1.000e-40 386-434 BL01226I

Of A/C 1 f\f\f\A Af\ AC(\ CAO 

ZD.UO l.UUUe-4U 4oU.-50o 

BL01226G 15.76 3.483e-32 292-

95-127 BL01226F 9.78 8.714e-23
253-271 


549 


BL00964 


Syndecans proteins. 


BL00964B 12.05 2.426e-10 1246- 

1 9RQ 


551 


DM01930 


YDR096W. 


fiMni ClfTE IS 41 1 *3/;7tt 37 1 7fi 

l^ivivisoujCi i.oo/eo/ i /u- 
215 DM01930F 14.16 8.232e-28 

9.163e-10 37-71 


552 


BL00195 


Glutaredoxin proteins. 


BL00195B 15.31 7.158e-09 9-29 


554 


BL00383 


Tyrosine specific protein phosphatases 


BL00383E 10.35 2.756e-12 436- 

AAH 
44 / 


555 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 7.612e-ll 122- 
137 PR00403A 16.82 3.912e-10
107-121 PR00403B 12.19 2.068e- 
09 76-91 


558 


PR00380 


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 2.714e-26 76-98 
PR00380D 9.93 3.000e-24 275- 
226-245 PR00380B 12.64 9.400e-
20 195-213


559


BL00518 


Zinc finger, C3HC4 type (RING finger),

proteins. 


"RT OOS 1 R 19 93 ^ 333p no ^77 ^71 


561 


PD01795 


PROTEIN AMINOPEPTIDASE
PRECURSOR HYDROLASE SIGNA.


PD01795B 11.56 2.333e-12 159-

179 PFkfn7Q^A 1ft 07 1 ftftAo t\Q 

135-144 


562 


PD01795 


PRECURSOR HYDROLASE SIGNA. 


Prtm 70^"R 1 1 o ^tja nun 

■T D\Jl /7JD 1 1 .JO Z.JJ jc- iZ 1 1 u- 

123 PD01795A 10.27 1.000e-09 


563 


BL00018 


EF-hand calcium-binding domain
proteins.


BL00018 7.41 1.391e-09 41-54


565 


BL00348 


p53 tumor antigen proteins. 


BL00348F 23.19 4.143e-09 188-
231 


567 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301B 5.49 4.115e-09 284-
295 


569 


PF00850


Histone deacetylase family.


PF00850D 14.76 1.519e-16 722- 

7A£ PT-fiftSKm- i ^ nc\ i 1 1 c» 1 1 
/*rO rruuoJUr ij./u i.iioe-ij. 

794-827 PF00850G 22.75 8.375e- 


570 


PD00289 


PROTEIN SH3 DOMAIN REPEAT
PRESYNA. 


Pnflft95lO 0 Q7 A Qfiflp 1 ft 1 1.1 1^1 


571 


BL00518


Zinc finger, C3HC4 type (RING finger),

proteins. 


RT fin^ 1 SI 1 9 07 R gftft^ 1 1 zL4 ^1 
DLUU J 1 o IZ.ZJ o.ovUe-I 1 'Hr-jj 


573 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 1.123e-11 123-175


574 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 3.700e-10 986- 
1021 


576 


BL00284 


Serpins proteins. 


BL00284C 28.56 5.200e-26 200- 

oL^UUZo4A 1 J.04 4.ylje-l5 

71-95 BL00284B 17.99 7.261e-15 

13JLUUZ04JJ 10.04 D. 54oe- 

13 306-333 BL00284E 19.15 

7 49Qp-19 7R7-419 


579 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.553e-29 15-54 


580 




Src homology 2 (SH2) domain proteins
profile. 


RT ^nftfilTi 1*7 Ad A ^AAa IT 1 fl 1 rk 

1031 


581 


PD00930 



PROTEIN GTPASE DOMAIN
ACTIVATION. 


rjj\j\)yj\jD jo./z j.ioye-zz duo- 
649 PD00930A 25.62 6.806e-17 


584 


BL00612 


Osteonectin domain proteins.


RT f)flfil9R 1 1 7*1 9 ft^4f»-l 1 01 

126 


585 


DM01551 . 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 8.859e-10 102- 
122 


586 


PF00628 


PHD-finger.


PF00628 15.84 3.455e-12 235-250


587 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.063e-10 85-128 


588 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE


PPfini'XA P 7^ 7 <7^a 1/C77T 

248 PR00326C 9.79 6.760e-15 
z/o-zyz rKUUozou iy.uyo.oj/e- 
13 293-312 PR00326B 16.74 

0 990<» 1 1 9/155 


589 


BL00422 


Granins proteins. 


BL00422A 28.34 7.429e-09 2349- 
2378 


590 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e-10 295- 
339 


591 


BL00128 


Alpha-lactalbumin / lysozyme C proteins. 


BL00128A 20.76 3.423e-13 35-65 
BL00128C 19.34 2.980e-11 110-
132


596 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 3.136e-09 31-46


597 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547C 17.30 1.667e-19 207- 
229 DM00547E 13.94 6.200e-18 
319-342 DM00547B 11.28 

1.000e-17 179-193 DM00547D

11.60 9.250e-13 289-303

UlVLUUD4/r 2.5 A5 o. /2/e-lz O /9- 

726 DM00547A 12.38 4.818e-11


600 


PD01066 


PROTEIN ZINC FINGER ZINC- 

FINGER METAL-BINDING NU.


PD01066 19.43 l.S82e-27 13-52 


601 


BL00192 


Cytochrome b/b6 heme-ligand proteins. 


BL00192A 11.90 6.400e-09 390- 


602 


BL00936 


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118-
157




BL00936


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118-
157 


606 


PR00019 


LEUCINE-RICH REPEAT 

SIGNATURE


PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09
323-337


607 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09 
323-337 


608 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE . 


PR00320C 13.01 9.500e-12 168-
183 PR00320A 16.74 2.853e-10 
60-75 PR00320A 16.74 4.706e-10 
14-29 PR00320C 13.01 5.320e-10 
60-75 PR00320C 13.01 5.680e-10
14-29 PR00320A 16.74 6.049e-09 
217-232 PR00320B 12.19 8.875e- 
09 168-183 


610 


BL00750 


Chaperonins TCP-1 proteins. 


BL00750B 16.17 1.000e-40 70- 
120 BL00750A 20.07 6.211e-37
26-69 BL00750G 20.12 8.800e-31
431-471 BL00750F 18.40 5.125e-
30 370-411 BL00750E 24.59
8.650e-29 295-332 BL00750H 
21.44 1.000e-27 489-524 
BL00750C 25.65 5.345e-17 149- 
181 BL00750D 16.16 6.318e-14 
203-222 


613 


BL00766 


Tetrahydrofolate . 
dehydrogenase/cyclohydrolase proteins. 


BL00766B 24.49 1.000e-40 142- 

1 AA TST ftftTZ^r 1 "» TO 1 AAA— A A 

190 BL00766E 13.78 1.000e-40 
Jz2-3j9 JdLUU/OOC D.jUUe- 

39 208-256 BL00766D 17.05 
4.jjoe-zo zoj-j i j J5JLAJU/00A 
21.48 6.063e-24 102-132 




tit nfn^£ 


/iuipoKLQciic norinone xamiiy proteins. 


T5T AA7<£ 10 OS OOCo 1 fi 7/!£ 7^ 


616 


BL00319 


Amyloidogenic glycoprotein extracellular 
domain proteins.


BL00319C 17.12 9.053e-09 419-

hj5 


617 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins.


BL00030A 14.39 4.429e-09 44-63 


618 


BL00030 


Eukaryotic RNA-binding region RNP-1
proteins. 




620 


BL00325 


Actin-depolymerizing proteins. 


BL00325B 21.66 5.817e-16 77- 
123 


622 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 


BL00972A 11.93 5.500e-19 213- 
SEQ
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 






family 2 proteins.


711 RT nnG77Ti 77 <c 7 Tvto* \C 

2.5 i r>L,uuy I2.U 2.742e-lo 
501-526 BL00972B9.45 l.OOOe- 

3.160e-11 370-385 BL00972E
20.72 7.517e-10 526-548


625 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.333e-39 6-45 






DEAD-box subfamily ATP-dependent

helicases proteins. 


DuvvvDyU 21.0/ /. /jUe-il 47o- 
524 BL00039A 18.44 2.000e-25 
iqr-717 rt nnn^op l^-AI 1 QAA* 

tJO-jLj l DLAJ\J\)jy\^ 13. Oj 1.544e- 

15 327-351 BL00039B 19.19 

S fi^fip-ld 947-96R 


630


PD00306


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-12 232-
246 


631


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-12 290- 
304 


633 


BL00785 


5'-nucleotidase proteins.


BL00785C 9.45 3.625e-16 108- 

1 77 . dt nmotr i c oc a aaa« i c 
12.2. oJLUU/ojll ID.OJ 4.UUUe-lo 

279-295 BL00785A 9.73 6.500e-
^A 70_jin "dt nn7C<n m . 

5.500e-13 72-86 BL00785D 9.89 

A (\C\C\t* 17 17^ "\AK 
H.UUwe-lZ IjD-I^D 


636 


PR00832 


PAXILLIN SIGNATURE 


PR00832E 14.43 9.901e-14 85- 

iUo 


637


PR00109


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 6.362e-13 221-
240 


638


PF00635


MSP (Major sperm protein) domain
proteins.


T>I?nn£KT5 1 ^ HA A onna 1 1 A&1 

502 


639 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 1.900e-18 85-99
PR00860C 9.61 1.474e-14 99-109
PR00860A 5.46 1.720e-14 63-76


641 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 4.462e-15 271-284
PD00066 13.92 4.462e-15 299-312
PD00066 13.92 2.800e-14 327-340
PD00066 13.92 2.800e-14 383-396
PD00066 13.92 2.800e-14 411-424
PD00066 13.92 7.000e-14 355-368
PD00066 13.92 8.800e-14 439-452
PD00066 13.92 8.800e-14 495-508

PD00066 13.92 1.500e-13 551-564 

PD00066 13.92 7.000e-13 467-480 

PD00066 13.92 7.000e-13 523-536 

rJJUUUOO \5.y2. y.DUUe-1 j 

PD00066 13.92 9.500e-13 243-256 

PD00066 13.92 9.500e-13 579-592 
pr^nnnA^ i ^ 07 r £i <** in <n7 £7n 

iUVJvUOO O.0lJC"JU OU/-0/U 

PD00066 13.92 1.600e-09 187-200 


642 


BL00961 


Ribosomal protein S28e proteins. 


BL00961B 11.24 7.429e-37 67-

i nn rt nno£i a o on a mo* i/c 
i uu DLfUuyo i a y.yu 4.u /yc-2,0 


643 


BL00585 


Ribosomal protein S5 proteins. 


BL00585A 28.43 1.391e-40 103- 
i ^ rt nn^SKR i c 7C 1 7<n* in 

193-230 


647 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


*rt nn/^751 0 fn 0 Ann*» in isi 1 07 


648 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876C 6.15 9.229e-09 112- 
126 


652 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 5.941e-27 29-68 


653 


BL00047 


Histone H4 proteins. 


BL00047A 13.53 1.000e-40 2-41 








BL00047B 6.51 1.429e-40 41-74
BL00047C 12.18 1.310e-38 74-
104 


654 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 4.109e-25 30-69


655 


BL01115 


GTP-binding nuclear protein ran proteins.


BL01115A 10.22 3.4S3e-17 19-63


657 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 8.286e-10 31-40


658


BL00125


Serine/threonine specific protein
phosphatases proteins.


135 BL00125C 19.97 1.000e-40 
153-200 BL00125D 33.11 1.000e-
40 213-268 BL00125A 14.83 
8.941e-3S 47-84 


659 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 492-505 
PD00066 13.92 9.308e-15 380-393
PD00066 13.92 6.000e-13 352-365 
Jrjjuuuoo ij.yz /.uuue-jo z^mj-xdj) 
PD00066 13.92 7.500e-13 268-281
ru\j\j\)0\j ij.yj, /.juuc-ij wo-hzi 
PD00066 13.92 2.174e-11 464-477
PD00066 13.92 1.000e-10 436-449


660 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.189e-26 29-68 


661


BL00795


Involucrin proteins.


BL00795C 17.06 7.882e-15 193-
238 BL00795C 17.06 3.797e-13
187-232 BL00795C 17.06 5.014e-
13 188-233 BL00795C 17.06
4.506e-12 196-241 BL00795C
17.06 7.896e-12 191-236
BL00795C 17.06 1.667e-11 185-
230 BL00795C 17.06 2.000e-11
198-243 BL00795C 17.06 3.778e-
11 171-216 BL00795C 17.06
6.111e-11 197-242 BL00795C
17.06 6.444e-11 194-239
BL00795C 17.06 8.000e-11 189-
234 BL00795C 17.06 8.556e-11
192-237 BL00795C 17.06 1.733e-
10 195-240 BL00795C 17.06
2.779e-10 184-229 BL00795C
17.06 4.035e-10 199-244
BL00795C 17.06 5.081e-10 186-
231 BL00795C 17.06 6.965e-10
190-235 BL00795C 17.06 2.700e-
09 200-245 BL00795C 17.06
5.800e-09 175-220 BL00795C

l /.uo o.juue-uy ioz-zz/ 

RT fifi7Q^P 17 Cifk A Afiflp-fiQ 001 

246 BL00795C 17.06 6.600e-09 

707-947 RT 0070^P 1 7 OA A Aftfip- 
09 208-253 


662 


BL00469


Nucleoside diphosphate kinases proteins.


RT 004A9 7? 11 1 fifiOp-40 14Q-70d 


663 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.411e-ll 331- 


664 


BL00601 


Tryptophan pentad repeat proteins (IRF 
family) proteins. 


BL00601A 20.29 5.500e-23 7-46 
BL00601B 20.92 3.631e-13 69-98 


665 


BL00082 


Extradiol ring-cleavage dioxygenases 
proteins. 


BL00082A 19.07 8.615e-12 49-72


666 


DM01537 


kw SKI2W SKI2 NUCLEOLAR


DM01537B 21.63 4.073e-37 834- 






HELICASE. 


881 DM01537B 21.63 9.750e-21 
1669-1716 DM01537A 15.14 
8.650e-18 698-718 DM01537A
15.14 6.766e-12 1537-1557 


667


DM01537


kw SKI2W SKI2 NUCLEOLAR

HELICASE.


DM01537B 21.63 7.973e-38 820-

867 DM01537B 21.63 9.750e-21 
1655-1702 DM01537A 15.14 
8.650e-18 684-704 DM01537A
15.14 6.766e-12 1523-1543


669 


BL00107 


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 6.786e-24 849-
880 BL00107B 13.31 6.727e-13 
916-932 


670 


BL00299 


Ubiquitin domain proteins.


BL00299 28.84 9.735e-27 37-89


671 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.571e-12 432-475


676 


PR00861 


ALPHA-LYTIC ENDOPEPTIDASE
SERINE PROTEASE (S2A) 
SIGNATURE 


PR00861E 9.88 2.385e-09 206- 
221


678 



BL00225 


Crystallins beta and gamma 'Greek key'
motif proteins. 


BL00225B 18.06 7.517e-24 1805-
1840 BL00225B 18.06 8.297e-20 
1987-2022 BL00225B 18.06 
2.575e-19 1896-1931 BL00225B 
18.06 8.200e-19 175-210 
BL00225B 18.06 8.200e-19 1698- 
1733 BL00225B 18.06 4.808e-14 
73-108 BL00225B 18.06 4.808e- 

14 1 TAT 0077^R 1 R 0£ 

5.500e-14 2077-2112 BL00225A
13.82 5.829e-12 2043-2064 

RT 007? S A n R? 3 177p-0Q 17*10- 

1780 


679 




G-PROTEIN BETA WD-40 REPEAT
SIGNATURE


PP00370P 13 01 A?40p-10 16Q- 

184 PR00320A 16.74 6.294e-10 
169-184 


680 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243I 31.77 1.143e-11 172-
215 


681 


PR00852 


XERODERMA PIGMENTOSUM 
GROUP D PROTEIN SIGNATURE 


PR00852H 5.90 1.000e-29 612- 
635 PR00852E 8.14 3.769e-27
348-371 PR00852D 11.38 8.875e-
27 309-331 PR00852B 11.08
2.800e-25 249-269 PR00852I 

17 7 A 3 S00p-7^ AR3-70A 
PP00R^7F 1 1 R^ S 000p-74 370- 

398 PR00852G 16.19 4.462e-23 
468-486 PR00852C 8.81 9.143e-
23 284-303 


682 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 1.375e-35 15-63 




BL00972


Ubiquitin carboxyl-terminal hydrolases

family 2 proteins. 


RT 00077 A 1 1 0*5 7 ^00p-70 4.0-^R 

BL00972D 22.55 3.903e-16 300- 

37S RT 00Q77R 0 zLS 1 000p-13 

120-130 BL00972E 20.72 5.500e- 
11 325-347 


687 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.273e-14 98- 
138 


688


BL00388


Proteasome A-type subunits proteins.


RT fW2RRA 7*3 Id 1 ftftfto Aft C ^4 

BL00388B 31.38 3.864e-33 66-
108 BL00388D 20.71 1.000e-21
153-184 BL00388C 18.79 8.147e- 
16 126-148 


689 


PD02796 


PROTEIN STEROL CARRIER LIPID- 


PD02796B 20.92 1.10Se-15 347- 






TRAN. 


394 


691 


PD01572 


PHOTOSYSTEM II REACTION 
CENTRE T PROTEIN PHOTOS. 


PD01572 8.77 4.083e-09 1-31


692 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 7.600e-10 488-505 




BL01013


Oxysterol-binding protein family
proteins.


563 BL01013D 26.81 8.235e-23 
814-858 BL01013C 9.97 6.211e-
14 615-625 BL01013B 11.33 

3 fiflSp-13 SQ9-fifH 

j.OUJC-Ij DyJ.~Q\JJ 


695 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 3.571e-13 164-178 
PD00289 9.97 8.650e-11 2147-
9ifii pr>nn?Ro o Q7 9 no 9^ 

<£1U1 aJJUV^O? 7.7/ j^.JJZC"u7 

37 


698 


PR00161 


NICKEL-DEPENDENT 
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e-09 282- 
302 


700 


PR00749


LYSOZYME G SIGNATURE 


PR00749F 13.63 8.636e-13 139- 
ljo rxvUU /4yrl o.Zz j.Ooie-1/ 
173-194 PR00749B 16.54 1.419e-
11 48-70 PR00749C 7.26 3.060e-
11 72-91 PR00749A 10.33 
4.815e-10 24-45 


703 


PR00704 


CALPAIN CYSTEINE PROTEASE (C2) 
FAMILY SIGNATURE 


PR00704I 9.52 1.000e-29 476-505 
PR00704D 11.05 2.500e-27 132-
158 PR00704E 12.55 5.500e-27
162-186 PR00704F 13.61 1.000e-
22 187-215 PR00704G 13.87
1.237e-21 317-339 PR00704H

PR00704A 14.68 2.125e-19 27-51 

PT?fWY7fYAP 1 1 Sfi 1 9S7<» 1 7 OA 

113 PR00704B 17.94 1.833e-15 
72-95 


705 


PR00859 


PROKARYOTE METALLOTHIONEIN 
SIGNATURE 


PR00859C 7.06 2.776e-09 94-111




BL00226


Intermediate filaments proteins.


oJLUUzzou ly.iu y.joie-Zo joy- 
416 BL00226B 23.86 3.250e-24 

21 268-299 BL00226A 12.77 


707 


PR00021


SMALL PROLINE-RICH PROTEIN
SIGNATURE


PR00021A 4.31 2.440e-10 2-15


708 


BL00361 


Ribosomal protein S10 proteins. 


BL00361B 18.34 5.101e-10 82- 
105 


709


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


710 


BL00514 


Fibrinogen beta and gamma chains C-
terminal domain proteins.


BL00514C 17.41 8.412e-27 160- 
itv i>i-.uuji4jfci 14.Z0 o.yuye-jo 
219-236 BL00514H 14.95 1.551e- 
1^^17-149 tk\ nnsi 1 S OR 

7.750e-15 284-314 BL00514D 
15.35 4.789e-10 201-214 


711 


PD00930 


PROTEIN GTPASE DOMAIN
ACTIVATION.


PD00930B 33.72 8.714e-12 49-90 


714 


BL00400 


LBP / BPI / CETP family proteins.


RL00400P 94 SI fi 099^-17 1 SR- 
202 BL00400D 23.26 2.080e-14 
222-259 BL00400A 21.59 1.600e-
10 27-59 


715 


BL01154 


RNA polymerases L / 13 to 16 Kd 


BL01154B 24.55 5.500e-36 40-76 






subunits proteins. 


BL01154A 18.70 3.000e-22 19-40 


716 


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PnniO££ 10 £% Q 7Rfip 30 1fi_/lQ 


717 


BL00215 


Mitochondrial energy transfer proteins.


RT flf!7 1 ^ A 1 ^ R7 Q ?0£a 1 A 77 

102 BL00215A 15.82 8.412e-10 
175-200 


719 


BL00309


Vertebrate galactoside-binding lectin 
proteins. 


BL00309C 18.65 2.241e-09 62-87


726 


BL00687 


Aldehyde dehydrogenases glutamic acid 

proteins.


BL00687E 25.37 7.136e-33 266- 
jio jdjuuuoo /lj zo.uu j.jsjje-zo 
151-198 BL00687B 17.54 3.647e- 
26 39-81 BL00687C 24.13 

2.500e-11 352-363


727


DM01354


ORP2. 


jjiviui j>dhjn ij.i/ i.uuue-4u izy- 
174 DM01354O 8.73 6.605e-15 
180-226 


734 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301A 10.24 6.400e-09 101- 
112 


735 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024A 10.26 1.000e-40 22-69 
BL01024B 8.91 1.000e-40 86-127 
BL01024C 7.80 1.000e-40 146- 
185 BL01024D 13.22 1.000e-40
185-222 BL01024E 11.96 1.000e-
40 222-200 BL01024F 9.42

1.000e-40 266-317 BL01024G

II. l.UUUe-4U J 1 /-JHy 

BL01024H 13.88 1.000e-40 389- 


736


PF00913 


Trypanosome variant surface
glycoprotein.


PF00913D 11.90 7.130e-10 24-51 


737 


PR00700 


PROTEIN TYROSINE PHOSPHATASE 
SIGNATURE 


PR00700D 12.47 2.200e-09 82-
101 


740 




G-PROTEIN BETA WD-40 REPEAT

SIGNATURE 


irivuujjtUi^ id.ui i.ouue-uy oo-oj 
PR00320A 16.74 7.366e-09 68-83 


743 


PR00871


NUCLEOTIDYLEXOTRANSFERASE
(TDT) SIGNATURE 


PRfi/lft71f^ 14/15 C ftfiHfi* fiO 17R 

201 


745 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 2.286e-10 33-42 


749 


BL00215 


Mitochondrial energy transfer proteins.


BL00215A 15.82 5.200e-15 221- 
246 BL00215A 15.82 7.618e-14 

Of\-A< RT fifi*7 1 < A 1 < CO Q C<1 a 11 
ZU-hD JdJLUUz IDA ID. 62 o.oDie-1 1 

123-148 BL00215B 10.44 9.526e- 
11 69-82 BL00215B 10.44 
/.Duue-uy z/z-zoj jji^uuZJ jjcj 
10.44 8.500e-09 165-178 


751 


BL50002


Src homology 3 (SH3) domain proteins
profile.


BL50002A 14.19 1.000e-14 370-
389 BL50002B 15.18 2.200e-10 

AfiR_A77 


752 


BL00353 


HMG1/2 proteins.


RT f»fl^S^R 11 47 ^ ORQp-17 ^Qfl- 
440 


753 


PF00622 


Domain in SPla and the Ryanodine
Receptor. 


PF00622B 21.00 4.214e-14 47-69 


754 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 8.941e-10 66-78


755 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926T 17.75 7.750e-19 392- 
415 PR00926C 16.07 5.935e-17 
253-274 PR00926D 10.53 2.059e- 
15 301-320 PR00926E 11.70 


4.971e-15 344-363 PR00926B 
16.07 9.526e-13 210-225 
PR00926A 10.41 1.514e-12 197- 
211 


756


BL01187 


Calcium-binding EGF-like domain 
proteins pattern proteins. 


BL01187A 9.98 2.125e-12 324-
336 BL01187A 9.98 4.789e-11
377-389 BL01187B 12.04 3.057e-


757 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins.


PF00651 15.00 4.429e-10 43-56 


758 


PR00055 


HIV TAT DOMAIN SIGNATURE 


PR00055A 8.13 8.855e-09 144- 
156 


759


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 5.304e-11 110-123


760 


PR00448 


NSF ATTACHMENT PROTEIN 
SIGNATURE 


T1T1 Art A A OT~\ 1 *"| A*"S *> A C e ^ n 

PR00448D 12.42 3.455e-27 162- 

186 PR00448A 10.74 1.273e-22 

37-57 PR00448B 16.01 9.379e-21 
inn iic DDfiA./Mo/"' n ac i aaa** 

20 129-147 


765 


BL01042 


Homoserine dehydrogenase proteins. 


BL01042A 13.29 5.909e-11 74-95


766 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 2.154e-18 26-46 
PR00625B 13.48 9.000e-16 57-78 


768 


BL00762 


WHEP-TRS domain proteins. 


BL00762A 23.43 8.500e-28 112- 
149 BL00762B 16.14 3.793e-12 
64-78 BL00762A 23.43 6.625e-12
6-43 BL00762C 15.58 4.176e-09 
459-472 BL00762D 11.15 9.667e- 
09 210-220 


769 


PR00709 


AVIDIN SIGNATURE 


PR00709A 4.60 1.934e-09 1-20


770 


PR00320 


G-PROTEIN BETA WD-40 REPEAT
SIGNATURE


PR00320C 13.01 1.720e-10 262- 
277 PR00320A 16.74 2.853e-10
262-277 PR00320C 13.01 4.300e- 
09 96-111 PR00320B 12.19 

C CAA«* AA T7*7 TITiAAOOA A 

5.500e-09 262-277 PR00320A
16.74 6.268e-09 55-70 


771


PR00019


LEUCINE-RICH REPEAT

SIGNATURE 


PR00019B 11.36 8.714e-12 87-
101 PR00019A 11.19 1.000e-10
90-104


772 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 110- 
159 


773 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 155- 
204 


774 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547F 23.43 3.942e-28 943- 
990 DM00547E 13.94 9.750e-21 
652-675 DM00547B 11.28 
1.818e-18 518-532 DM00547C
17.30 3.531e-17 546-568
DM00547A 12.38 1.273e-11 497-
509 DM00547D 11.60 9.200e-11
622-636 


776 


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR
SIGNATURE


PR00779F 14.51 5.147e-09 769- 
792 


777


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE- 
BINDING PROTEIN RECEPTOR

SIGNATURE 


PR00779F 14.51 5.147e-09 742- 

765


778


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE- 
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742- 
765 


779 


BL01282 


BIR repeat proteins.


BL01282B 30.49 2.543e-09 6-45 


781 


PR00205 




672 PR00205B 1 1.39 8.588e-l J 
230-248 PR00205B 11.39 8.527e-
10 551-569 PR00205B 11.39 
4.203e-09 336-354 


783 


BL00625 


Regulator of chromosome condensation 
(RCC1) proteins. 


BL00625B 17.69 2.167e-19 193- 
227 BL00625A 16.21 5.500e-17 
199-228 BL00625B 17.69 l.S85e- 

10 I4U-W4 tsJLrUUozjJD 1 /.o9 

2.770e-16 245-279 BL00625A 

JO.Zl 7. 1 1 JC" lO 

BL00625A 16.21 6.507e-14 146- 

17S 
i / J 


785 


PF00084 


Sushi domain proteins (SCR repeat 

proteins).


PF00084B 9.45 7.188e-10 595-607

PPA.aao,it> 0 AS A ACi(\a. AO ££Q 


786 


PF00084 


Sushi domain proteins (SCR repeat 


PF00084B 9.45 7.188e-10 595-607 
rruyuono y.Hj o,*fuue-u" odo-ooo 


787 


BL00826 


MARCKS family proteins. 


BL00826C 7.63 6.738e-09 203-
230 


788 


PR00453


VON WILLEBRAND FACTOR TYPE

A DOMAIN SIGNATURE 


"DT_> AA/1 A 10 -tq 1 oia*. i /i 'i/r ca 

PR00453B 14.65 8.568e-10 75-90 


789 




CARBAMOYLTRANSFERASE 
SIGNATURE 


t>T) AA 1 rtOD 1 /I 0*5 c A 1 Q«. Aft ft/CO 

rKUUlUziJ 14.82 D.41oe-U9 Voj- 

977 


790 


BL00030


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030B 7.03 5.500e-11 199-
209 


791


BL00415


Synapsins proteins.


BL00415N 4.29 9.519e-10 393-
437 BL00415N 4.29 2.117e-09

1 f\1 \A1 "DT C\(\A 1 <\l A iTOOa 

09 97-141 BL00415N 4.29
5.664e-09 387-431 


795 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 2.091e-36 105-144 


799 


PF00731 


AIR carboxylase. 


PF00731C 23.16 7.333e-35 337-

PF00731B 19.47 7.429e-28
299-336 PF00731A 19.32 6.333e-
24 268-297 


804 


BL00170


Cyclophilin-type peptidyl-prolyl cis-trans
isomerase signatur.


"D7 flAlTnO OA 0*7 O A71 a Aft OAT 

337 


805


BL00678


Trp-Asp (WD) repeat proteins proteins.


DT AA/CTO A £T U ylAA« 1A 1TO 1 Oft 

tsi^uuo/o y.o/ J.40Ue-iu i/o-ioy 
BL00678 9.67 5.800e-10 418-429


806 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE 


PD01719A 12.89 7.571e-14 290-

J 10 


807 


PR00320 


G-PROTEIN BETA WD-40 REPEAT
SIGNATURE


PR00320B 12.19 9.100e-09 451-


809 


BL00107 


Protein kinases ATP-binding region 


BL00107A 18.39 4.462e-12 564- 


810 


PR00453 


VON WILLEBRAND FACTOR TYPE
A DOMAIN SIGNATURE


PR00453A 12.79 1.310e-14 36-54

PPfinA^Tl 1/1 Q C/COn 1A7C OA 

x^ivuu^jjjD 14. oD o.jooe-iu /j-yu 


814 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55 


815 


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 2.047e-31 16-55


817 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.154e-36 125- 
154 PR00193E 19.47 3.919e-18 
179-208 


818 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE


PR00830A 8.41 9.571e-11 115-






PROTEASE (S16) SIGNATURE


135 


819 


BL00126


3'5'-cyclic nucleotide phosphodiesterases
proteins.


BL00126C 22.07 7.857e-24 528- 
joy DJLUUlzob j5.zz j.714e-l j 
669-724 BL00126D 25.50 1.173e-
14 584-623 BL00126B 15.20 

l.UUUe-1^ DVZ-j 14 DiAIUlZOA 

27.56 3.361e-09 461-498 




PR00511


TEKTIN SIGNATURE


rivUUjlliJ IZ.O o.ozoe-zz 1 /4- 

195 PR00511A 13.59 7.723e-11


821 


BL00741 


Guanine-nucleotide dissociation 


BL00741B 14.27 2.800e-15 13-36


822 


PF00780 


Domain found in NIK1-like kinases,


PF00780I 14.69 4.825e-09 231-

ZOl 


827 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 5.235e-11 144-
163 


828 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 9.357e-11 545-
586 


829 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN.


PD02448A 9.37 1.000e-40 46-85 
PD02448B 10.17 1.000e-40 85- 
133 PD02448C 13.62 1.000e-40 
152-189 PD02448E 11.33 9.000e- 

T A nr O^ 1 TVTNAOvl A OT 1 A 

30 235-261 PD02448F 14.22 
9.654e-25 279-303 PD02448D 

11 AO 1 CC Oa 10 1 n*7 Oil 

11.4o 3. ODyt- lis 197-211 
PD02448G 10.73-7.857C-16 SOS- 
SIS ■ 


830


BL00720


Guanine-nucleotide dissociation
stimulators CDC25 family sign.


TQT AATOA'Q 1£ <*7 >1 CAA« 0"3 ylO 

niuvv fZvB lO.j/ 4oU0e-23'4S3- . 

^ft7 

j\j 1 . 


831 


BL00107 


Protein kinases ATP-binding region

proteins.


BL00107A 18.39 6.625e-21 143- 

L /H JoJLUUlU/JD 13.31 **u.\ 4e-iu 

213-229 


832 




Mitochondrial energy transfer proteins.


RT flftOl ? A 1 CO < 7C7o 1 1 n C*7 
OlAHMljA 10. J./o/e-lJ 


833 


PR00497 


NEUTROPHIL CYTOSOL FACTOR 


PR00497A 6.92 4.375e-09 41-59 


834 


BL00229 


Tau and MAP proteins tubulin-binding 


BL00229A 23.57 9.565e-10 99- 

1 oo 

1 JO 


835 


BL00421 


Transmembrane 4 family proteins. 


BL00421E 20.97 2.216e-09 1053- 
1083


836 


BL00795 


Involucrin proteins. 


BL00795B 12.41 7.931e-09 405- 
445 


837 


PR00020 


MAM DOMAIN SIGNATURE 



PR00020A 18.17 1.000e-17 34-53
rKuuuZuii Ij.jz 5.o4oe-16 oo-85 
PR00020D 12.70 2.543e-15 147- 

i £7 dtjaaaoa./" 1 1*3 ££"3 ylCOa i-i 
loz rKUUUZUU 13.00 3.4o3e-13 

95-107 PR00020E 8.64 6.586e-13 

170 

IOj-1 ly 


838 


BL50017 


Death domain proteins profile. 


BL50017B 17.60 6.897e-13 1499- 
1515 


839 


PF00850 


Histone deacetylase family. 


PF00850C 14.55 9.542e-09 1352- 
1369 






Ank repeat proteins.


PF00023A 16.03 4.500e-12 44-60
PF00023B 14.20 7.923e-11 73-83
PF00023B 14.20 9.000e-10 139- 
149 PF00023B 14 70 S Sflflp-ftQ 
40-50 


842 


BL01194 


Ribosomal protein L15e proteins. 


BL01194B 13.66 1.000e-40 37-85 
BL01194C 12.35 9.250e-40 103- 
138 BL01194A 18.70 7.632e-33 










2-37 BL01194D 19.02 2.658e-36 
139-178 


843 


BL00610 


Sodium:neurotransmitter symporter 
family proteins. 


BL00610A 17.73 1.000e-40 40-90
BL00610B 23.65 1.000e-40 104-
154 BL00610C 12.94 1.000e-40
206-258 BL00610E 20.34 1.000e-

40 355-398 BL00610F 29.02
1.000e-40 454-509 BL00610D
20.97 6.063e-35 272-325 
BL00610G 12.89 8.588e-13 514- 

S37 


845 


BL00143 


Insulinase family, zinc-binding region 


BL00143A 20.91 4.300e-20 94- 

171 TXI AA1/43r* l/t l/C < CAAfl T3 

oJLUulnJU 14. 10 D.jUUe-1 3 

245-258 BL00143B 14.41 9.053e- 

1 A "M l_i <A 

1U 1HL-LDO 


846 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


847 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


848 


BL00824 


Elongation factor 1 beta/beta'/delta chain 
proteins. 


BL00824C 14.58 1.000e-40 129- 
167 BL00824D 14.04 6.192e-39
167-202 BL00824B 9.21 2.080e- 
xl yo-110 DlA)Uoz4Jb ilAy 
3.333e-19 210-226 BL00824A 

13 7R 5t /^n*» 14 10 3A 
1.5. /o o.UJve-14 l?-j*t 


849 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 1.000e-40 12-51 


850 


PD01066 


PROTEIN ZINC FINGER ZINC- 

FINGER METAL-BINDING NU.


PD01066 19.43 7.316e-24 10-49


852 


BL01272 


Glucokinase regulatory protein family 
proteins. 


BL01272B 19.61 6.870e-30 136- 

1/1 JtSLUlz/ZL* 11. Do J.314e-ZJ 

249-274 BL01272A 6.49 1.231e-

l Q OQ 1 17 
15 yy-ll I 


853 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.341e-20 65- 
106 


854 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 6.850e-11 140-154


858 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450C 12.22 3.250e-25 68-90 
PR00450B 11.76 8.125e-23 22-42 
PR00450D 16.58 8.920e-22 92-

1 12 rKUU4D01i lZ.14 l.Dole-iy 

1 1A l 33 T)TjAf\4 f Ar; 1 < 33 < <An^ 
114-Ijj JrxvUU4DULi IDoo j.juue- 

19 166-187 PR00450F 12.30 
13.58 1.857e-14 8-23 


860 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.188e-27 74-117 


866 


BL00477 


Alpha-2-macroglobulin family thiolester
region proteins. 


BL00477L 23.51 7.480e-20 54-87 


867 

BL01078


Molybdenum cofactor biosynthesis
proteins. 


J3JLUlU/oJ5 14«ZU l.Oile-ZU 4Uo- 

429 BL01078A 10.16 2.000e-13 
366-379 BL01078D 5.99 3.455e- 
11 566-576 BL01078C 10.52 
3.793e-11 501-513




BL01177


Anaphylatoxin domain proteins.


BL01177E 20.64 5.800e-24 462-
489 BL01177C 17.39 5.333e-19
416-435 BL01177B 13.61 7.840e- 
16 122-138 BL01177D 17.50 
1.900e-15 441-459 


869 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 415-








442 BL01177C 17.39 5.333e-19 
369-388 BL01177B 13.61 7.840e- 
16 122-138 BL01177D 17.50 
1.900e-15 394-412


871 


BL50007 


Phosphatidylinositol-specific 
phospholipase X-box domain proteins 
prof. 


BL50007A 19.61 1.000e-40 322-
joo djljUuu/jj iioh- i.uuue-4U 
589-631 BL50007B 20.90 6.700e- 

3£ 383-491 T5T ^00071? 9^ £3 

9.053e-33 748-785 BL50007C 

R 07 ^ 900p-1 0 4S9-4AQ 


872 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 

family 2 proteins.


BL00972D 22.55 3.250e-17 90- 


874 

O /*T 


PR 004 S9 


DOM ATM ^TfrNFAnTRP 


PP004^9R 1 1 fiS 4 9S0p-0Q 370 

386 


877 

BL00741


Guanine-nucleotide dissociation

stimulators CDC24 family sign. 


RT 0074 IT* 14 97 ^ ^00p-13 1 343 

1366 


878


DM00215


PROLINE-RICH PROTEIN 3.


nM0091 ^ 10 43 9 ^9 00 ^9 8^ 


881


PD02807


APOLIPOPROTEIN E PRECURSOR

APO-E GLYCOPROTEIN PLAS. 


PTW? 80/717 1 fl QA A 7H9o HQ 

407 


882 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.188e-37 8-47


885 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 8.071e-09 10-26


886 


PR00372 


BIOPTERIN-DEPENDENT 
AROMATIC AMINO ACID 
HYDROXYLASE SIGNATURE 


PR00372B 10.30 9.308e-27 225- 
248 PR00372A 13.39 7.000e-24 
134-154 PR00372E 12.62 2.125e- 
23 360- j bO PR00372C 7.90 
3.025e-22 289-309 PR00372F 
13.09 6.333e-21 395-414

Dl? Art 3*79 T\ 1 A 99 1 (\f\f\a 1Q ?OQ 

348 


887




GTP-binding elongation factors proteins. 


tit nA3n 1 d on no 9 oaa^ o/i 1 m 
135 BL00301A 12.41 4.316e-13 

9T 33 


888 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins.


BL00518 12.23 1.667e-09 30-39 


889 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.906e-26 6-45 


890 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 7.652e-09 113- 
123 




BL01022


PTR2 family proton/oligopeptide
symporters proteins. 


Til ai noon - n to / Ai _ t a ti 
dLU 1 U22i5 22.19 o.Ol oe- 14 72- 

118 BL01022E 23.51 1.173e-12 

479 ^nc nT aia99A 1 1 « 0 
4/zOUo JtsJLUlUzzA 1 i.jo y.lo 3e- 

12 42-61 BL01022D 9.42 3.455e-
1 1 1 00 9 1 9 


893


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360- 
383 


894 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360- 
383 


895 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE 


PR00237B 13.50 9.100e-14 116- 
138 PR00237F 13.57 1.360e-13
312-337 PR00237G 19.63 9.069e-
13 353-380 PR00237E 13.03
7.120e-12 243-267 PR00237D
8.94 4.150e-11 194-216

PR009^7A 1 1 4R 4 37^p-l 1 

IIVUUXJ 1 r\ l I .HO H.J / JC'l X 0_> 

108 


896 


BL00129 


Glycosyl hydrolases family 31 proteins. 


BL00129D 16.76 8.258e-26 634- 
678 BL00129A26^1 1.720e-25 ■ 
384-430 BL00129E 22.60 4.857e- 








9^ fiQR-734 RT flfl19Qp k n 

19.19 5.891e-18 495-522 
BL00129F26 19 7 545e-lS 814- 
852 


897 


BL00598 


Chromo domain proteins. 


BL00598 14.45 1.220e-13 9-31 


898 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 6.000e-09 396-405 


899 


PD01101 


INHIBITOR HEAVY CHAIN 

PHA>JXnPT TTsJ 
v^n/A i> in JC/J_< in.- 


PD01101B 21.53 1.000e-40 274-
djl/ r u\j i x\j iu zh.'Vd i.uuue-^u 
457-512 PD01101A 18.25 6.268e-
23 83-117 PD01101C 12.69 

6.73 7.750e-12 566-576 


900 


PR00600 


PROTEIN PHOSPHATASE PP2A 55KDA
REGULATORY SUBUNIT
SIGNATURE 




901 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.116e-31 24-63 


903 


BL01115 


GTP-binding nuclear protein ran proteins.


BL01115A 10.22 1.509e-11 21-65


906 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-13 539- 
572 DM00215 19.43 4.750e-12 
549-582 DM00215 19.43 9.824e- 
11 551-584 DM00215 19.43 

19.43 4.054e-10 550-583

585 DM00215 19.43 7.107e-10 


907 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 6.276e-12 314- 

^39 


908 


BL00107 


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 5.950e-17 1125-

IK/! 

1 X JU 


909 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1118- 
1149


910 


BL00107 


Protein kinases ATP-binding region 


BL00107A 18.39 8.560e-13 150- 

181


911 


BL00107 


Protein kinases ATP-binding region

proteins. 


181 


912 


PF00856 


SET domain proteins. 


PF00856A 26.14 4.553e-11 243-
280 


913 


PF00628 


PHD-finger. 


PF00628 15.84 6.400e-13 197-212 


914 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE


PR00962D 10.40 1.000e-27 435- 
459 PR00962G 15.71 4.086e-26 
593-618 PR00962B 11.98 9.122e- 
26 296-319 PR00962A 13.28 
6.143e-22 15-34 PR00962C 8.00
4.000e-21 348-369 PR00962F
12.39 9.769e-21 552-572
PR00962H 13.32 2.636e-20 623-
643 PR00962I 11.68 9.786e-20
692-712 PR00962E 8.81 2.915e-
18 515-534 


915 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 365- 
389 PR00962G 15.71 4.086e-26
523-548 PR00962A 13.28 6.143e-
22 15-34 PR00962C 8.00 4.000e-
21 278-299 PR00962F 12.39 
9.769e-21 482-502 PR00962H 








13.32 2.636e-20 553-573 
PR00962I 11.68 9.786e-20 622-
642 PR00962E 8.81 2.915e-18
445-464 


916 


BL00134 


Serine proteases, trypsin family, histidine 
proteins. 


BL00134A 11.96 5.886e-14 90-
107 


917 


BL00478 


LIM domain proteins. 


BL00478B 14.79 8.393e-13 211-
226 BL00478B 14.79 6.712e-10
271-286 


918 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 5.729e-09 973- 
988 


922


BL00150 


Acylphosphatase proteins. 


BL00150 25.33 1.000e-40 37-84 


924 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 8.063e-09 79- 
113 


925 


BL00072 


Acyl-CoA dehydrogenases proteins. 


BL00072D 30.08 2.837e-24 280- 
331 BL00072E 24.12 8.200e-24
368-411 BL00072C 25.30 7.873e- 
20 226-267 BL00072B 9.48 
6.049e-12 183-196 


927 


BL00237 


G-protein coupled receptors proteins. 


BL00237C 13.19 1.692e-13 229-
256 BL00237A 27.68 6.657e-13
90-130 BL00237D 11.23 9.571e-
13 290-307 


928 


BL01033 


Globins profile. 


BL01033A 16.94 7.923e-18 25-47
BL01033B 13.81 1.000e-15 93-
105


929 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 8.714e-13 203-

253


932 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-10 353- 
397 BL00415N 4.29 2.117e-09
63-107 BL00415N 4.29 3.628e-09
57-101 BL00415N 4.29 5.664e-09
347-391 


933 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85 
PD02448B 10.17 1.000e-40 85- 
133 PD02448C 13.62 1.000e-40 
152-189 PD02448E 11.33 9.000e- 
30 223-249 PD02448F 14.22 
9.654e-25 267-291 PD02448D 
11.48 3.659e-18 197-211 
PD02448G 10.73 7.857e-16 293-
306


934 


DM00191 


w SPAC8A4.04C RESISTANCE 
SPAC8A4.05C DAUNORUBICIN. 


DM00191D -13.94 9.083e-10 136-
175


935 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 4.696e-10 67- 
111 


936 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 8.138e-14 865- 
895 


937


PR00762 


CHLORIDE CHANNEL SIGNATURE 


PR00762A 14.22 4.000e-22 183- 
201 PR00762C 9.29 1.000e-21
268-288 PR00762E 12.07 3.250e-
20 520-537 PR00762D 11.29
1.000e-19 470-491 PR00762F
15.12 1.429e-19 538-558 
PR00762B 12.12 1.818e-18 214- 
234 PR00762G 14.13 3.455e-17 
577-592 


938 


BL00027


'Homeobox' domain proteins. 


BL00027 26.43 9.500e-25 291-334 


939 


DM01111 


4 kw PHOSPHATASE 


DM01111E 17.28 1.568e-10 248-






TRANSFORMING 61K PDF1. 


297 DM01111E 17.28 5.168e-10
659-708 DM01111D 16.76

<. 7/\3*» flO 770 "TiTViffil 1 1 1 \JT 

10.67 8.674e-09 911-935


940 


BL00107


Protein kinases ATP-binding region
proteins. 


tjt aaiatq 10 01 1 aaa^ ta oao 
309 BL00107A 18.39 6.760e-13 

77Q-7A0 


942 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.832e-11 543-
597 




PD01066


FINGER METAL-BINDING NU. 


rXKJlUoo ly.43 3.MJ0eo5 b-47 




BL00989


Clathrin adaptor complexes small chain
proteins. 


TIT AAQOQ'D <1 1 AAfta Af\ £.£. 

Di^UUyoyxJ Zo.jI l.UUUe-4U 00- 

1)7 BL00989A 11.66 1.000e-13 


946 


PR00178


FATTY ACID-BINDING PROTEIN
SIGNATURE 


PT?flfi1 7Rr* 1 7 <7 Q <1t a AO A Zf\ 
risSJ\) 1 10D 13. y.D/ie-Uy 43U- 

469 


947 


BL00178


Aminoacyl-transfer RNA synthetases
class-I proteins. 


t5U\)V L /od /.ll 4.oj/e-Uy lio- 

724 


948 


PF00628 


PHD-finger. 


PF00628 15.84 8.412e-14 201-216 


951


BL00216


Sugar transport proteins. 


BL00216B 27.64 2.050e-10 180-


952 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 4.300e-11 26-49
PR00926F 17.75 6.348e-09 134- 
157 




PF00109


Beta-ketoacyl synthase. 


PF00109 13.08 2.846e-12 342-357


957


PR00069 


ALDO-KETO REDUCTASE
SIGNATURE


PR00069A 16.01 8.826e-24 26-51 

DTJAAA/TCD 11 "5? 1 CIA* in 

rKUUUoyo ll.jj l.jl4e-17oo- 
105 PR00069C 16.03 8.816e-14


958 


PF00583 


Acetyltransferase (GNAT) family. 


PF00583A 12.53 5.500e-10 631- 

£/17 
04Z 


961 


PR00328 


GTP-BINDING SAR1 PROTEIN
SIGNATURE


PR00328A 10.62 8.740e-10 7-31 


962 


BL00354


HMG-I and HMG-Y DNA-binding
domain proteins (A+T-hook).


1499 


963 


BL00354


HMG-I and HMG-Y DNA-binding
domain proteins (A+T-hook).


1499 


964 


BL00027


'Homeobox' domain proteins.


xiJL^UUUx/ 2o.4j /.lolSe-Z/ 33-yo 


965 


PF00992 


Troponin. 


PF00992A 16.67 2.421e-09 581- 

010 


966 


PR00515 


5-HYDROXYTRYPTAMINE 1F
RECEPTOR SIGNATURE 


PR00515D 7.91 5.741e-09 13-33


967 


BL00579


Ribosomal protein L29 proteins.


"DT AAC7AT5 0 1 AA C f\C C ^ oi l^vl 

BL00579B 21.99 5.065e-21 164-
194 


970 




Fumarate reductase / succinate
dehydrogenase FAD-binding site
proteins.


X5JLU0504C 1 o.oo 2.227e-24 34-59 
BL00504D 10.43 7.261e-21 75-93 


973 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e-09 249- 

771 


974 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456F 5.86 1.000e-10 242-254


975


BL00237


G-protein coupled receptors proteins. 


BL00237A 27.68 4.429e-22 99-
139 


976 


BL00031 


Nuclear hormones receptors DNA- 
binding region proteins.


BL00031A 19.55 7.158e-33 60-93 

"RT 000^ IP. 7? 7$ S ^nnp-7R 04- 

126 


977 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 196-209 
PD00066 13.92 8.200e-16 336-349
PD00066 13.92 2.385e-15 476-489








PD00066 13.92 9.308e-15 252-265
PD00066 13.92 2.800e-14 448-461
PD00066 13.92 4.600e-14 392-405
PD00066 13.92 5.200e-14 280-293
PD00066 13.92 4.000e-13 224-237
PD00066 13.92 4.429e-12 308-321
PD00066 13.92 9.571e-12 420-433
PD00066 13.92 6.870e-11 168-181


978 


BL00721 


Formate-tetrahydrofolate ligase proteins. 


BL00721B 13.21 1.000e-40 346-
401 BL00721D 13.90 1.000e-40
538-592 BL00721E 13.46 1.000e-
40 597-646 BL00721I 18.79
2.500e-40 814-860 BL00721H
21.20 8.239e-39 763-814
BL00721A 15.31 9.719e-32 287-
321 BL00721C 16.92 4.000e-30
498-535 BL00721F 15.96 8.232e-
27 660-702 BL00721G 7.97
3.017e-10 721-734 


981 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 2.552e-09 180- 
201 


982 


BL00869 


Renal dipeptidase proteins. 


BL00869C 12.58 3.172e-19 59-95 
BL00869E 13.12 9.129e-18 120- 
157 BL00869J 15.60 6.032e-17 
270-310 BL00869H 11.08 1.840e- 
16 219-242 BL00869G 13.55
2.543e-16 192-214 BL00869F
12.77 7.031e-14 157-192
BL00869I 12.92 3.274e-12 242-
270 BL00869D 14.02 5.282e-10
31-61


983 


PR00196 


ANNEXIN FAMILY SIGNATURE 


PR00196F 13.89 2.125e-09 92-108 


984 


BL00485 


Adenosine and AMP deaminase proteins. 


BL00485D 30.82 2.427e-10 154-
209 



* Results include, in order: accession number subtype; raw score; p-value; position of the signature in the amino acid
sequence.
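
The field order given in the footnote above can be illustrated with a short parsing sketch. This is only an illustration under stated assumptions: the names `SignatureHit`, `_HIT` and `parse_results` are hypothetical and do not appear in the original text, and the pattern assumes that every accession consists of two letters, five digits and an optional subtype letter, as in the entries of this table.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    accession: str   # accession number plus optional subtype letter, e.g. "PR00962D"
    score: float     # raw score
    p_value: float   # p-value
    start: int       # first residue of the signature in the amino acid sequence
    end: int         # last residue of the signature in the amino acid sequence


# Assumed layout of one hit: accession, score, p-value, "start-end",
# separated by whitespace (a line wrap may fall after the hyphen).
_HIT = re.compile(
    r"([A-Z]{2}\d{5}[A-Z]?)\s+(-?\d+(?:\.\d+)?)\s+"
    r"(\d+(?:\.\d+)?e-\d+)\s+(\d+)-\s*(\d+)"
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Split the text of one RESULTS cell into its individual hits."""
    return [
        SignatureHit(acc, float(score), float(p), int(start), int(end))
        for acc, score, p, start, end in _HIT.findall(cell)
    ]
```

For example, `parse_results("BL00518 12.23 1.667e-09 30-39")` yields a single hit whose signature spans residues 30 to 39; a cell listing several subtypes yields one hit per subtype, which could then be sorted by p-value.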



TABLE 4 



SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE


2 


ig 


Immunoglobulin domain 


3.9e-17 


60.3 


3 


HSP90 


Hsp90 protein 


0 


1548.4 


6 


tsp_1


Thrombospondin type 1 domain 


0.002 


22.1


7 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


6.7e-08 


27.3 


9 


PWWP 


PWWP domain 


8.1e-16 


66.0 


12 


C1q


C1q domain


1.7e-26 


101.5 


13 


C1q


C1q domain


2e-20 


81.3 


14 


Aa_trans 


Transmembrane amino acid 
transporter protein 


2.7e-42 


153.9 


15 


E1-E2 ATPase 


E1-E2 ATPase 


6.3e-124 


412.2 


16 


trypsin 


Trypsin 


1.2e-87 


278.6 


17


ig 


Immunoglobulin domain 


7.6e-12 


43.2 


18 


lectin_c


Lectin C-type domain 


0.0003 


21.2 


20 


Alpha_L_fucos 


Alpha-L-fucosidase


1.2e-217 


736.5 


22


pkinase 


Eukaryotic protein kinase domain 


3.3e-87 


303.1 


23


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


24


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


25


ank


Ank repeat 


5.5e-14 


59.9 


27


pkinase 


Eukaryotic protein kinase domain 


1.5e-100 


347.4 


28


spectrin 


Spectrin repeat 


4e-57 


203.2 


29 


spectrin 


Spectrin repeat 


4e-57 


203.2 


30 


WD40 


WD domain, G-beta repeat 


1.2e-07 


38.8 


33 


rrm 


RNA recognition motif. 


1.1e-17


72.2 


34


rrm 


RNA recognition motif. 


1.1e-17


72.2 


3o 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


3e-36 


117.3 


37


ank 


Ank repeat 


5.9e-25 


96.3 


38 


SRF-TF 


SRF-type transcription factor 


1.4e-36


133.9 


40 


alk_phosphatase 


Alkaline phosphatase 


0 


1034.9 


44 


zf-C2H2


Zinc finger, C2H2 type 


8.6e-103 


354.9 


4D 


sugar_tr


Sugar (and other) transporter 


3.1e-08 


40.3 


47 


7tm_2 


7 transmembrane receptor (Secretin 
family) 


6.4e-79 


275.6 


50 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-98 


341.0 


51 


filament 


Intermediate filament proteins 


1.2e-176 


600.3 


52 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


2.7e-10


37.7 


53 


Cadherin_C_term


Cadherin cytoplasmic region 


1.9e-94 


327.2 


54 


S_100 


S-100/ICaBP type calcium binding 
domain 


5.2e-18 


73.3 


58 


inositol_P


Inositol monophosphatase family 


5e-13 


49.8 


59 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


8.8e-46 


147.6 


60


Kunitz_BPTI


Kunitz/Bovine pancreatic trypsin 
inhibito 


3.7e-47 


148.6 


62


DAD 


DAD family


2.5e-74 


260.3 


63 


MOZ_SAS


MOZ/SAS family 


5.9e-133 


455.1 


64


MOZ_SAS


MOZ/SAS family 


1.7e-123 


423.6 




ras 


Ras family 


9.3e-89 


308.3 


67


Ham1p_like


Ham1 family


3.7e-49 


176.7 


AC 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


5.2e-39 


126.1 


70




Zinc finger, C2H2 type


1.5e-112 


387.3 


71


Peptidase_M41 


Peptidase family M41 


1.2e-110 


381.0 


72


abhydrolase 


alpha/beta hydrolase fold


9.8e-05 


26.5 


81


K_tetra


K+ channel tetramerisation domain


0.022 


-16.8 


82


pkinase 


Eukaryotic protein kinase domain 


5e-49 


176.3 


84


AAA


ATPases associated with various 
cellular act 


1.3e-77 


271.3 


85


homeobox


Homeobox domain 


1.4e-28 


108.3 


87


TGF-beta


Transforming growth factor beta like 


6.7e-68 


210.2 


91


mito_carr 


Mitochondrial carrier proteins 


4.6e-57 


198.5 


?j 


adenylatekinase


Adenylate kinase 


1.1e-15


60.0 




ig


Immunoglobulin domain 


4.1e-20 


69.8 


00 

77 


urNn 


L/Nri aomain 


3.4e-120


412.7 




homeobox 


Homeobox domain 


7.4e-32 


119.3 




zf-C2H2


Zinc finger, C2H2 type


2.2e-47 


170.8 


102 


zf-C2H2 


Zinc finger, C2H2 type






103 


dynamin 


Dynamin family 


1.4e-150 


513.6 


104 


lectin_c


Lectin C-type domain 


4.2e-15 


63.6 


105 


lectin_c 


Lectin C-type domain


4.2e-15 


63.6 


108 


metalthio 


Metallothionein 


2e-25 


97.9 


112 


HSP20 


Hsp20/alpha crystallin family


2.6e-20


77.7 


115 


EF_TS


Elongation factor TS 


3.8e-63 


221.1 


116 


sugar_tr


Sugar (and other) transporter 


4e-63 


223.1 


118 


catalase 


Catalase 


0 


1158.9 


119 


UCH 


Ubiquitin carboxyl-terminal
hydrolase, famil 


1e-10


24.4 


122 


metalthio 


Metallothionein 


2.8e-25 


97.4 


125 


adh_short 


short chain dehydrogenase 


1.6e-45 


164.6 


126 


KRAB 


KRAB box 


7.9e-25 


95.9 


127 


G-alpha 


G-protein alpha subunit 


1e-249


843.0 


128 


mito_carr


Mitochondrial carrier proteins 


2e-65 


227.2 


131 


EF1BD 


EF-1 guanine nucleotide exchange 
domain 


4.9e-53 


189.6 


132 


GYF 


GYF domain 


4.9e-28 


106.6 


133 


GYF 


GYF domain 


4.9e-28


106.6 


134 


lipocalin 


Lipocalin / cytosolic fatty-acid 
binding pr 


2.1e-33 


119.1 


135 


pkinase 


Eukaryotic protein kinase domain 


3.3e-86 


299.8 


136 


ank 


Ank repeat 


2.2e-29 


111.1 


137 


IL8 


Small cytokines 
(intecrine/chemokine), inter 


3.1e-18


65.2 


139 


pyridoxal_deC 


Pyridoxal-dependent decarboxylase 
conse - 


0.00011 


19.0 


140 


cadherin 


Cadherin domain 


1.3e-88 


307.8 


142 


efhand 


EFhand 


5.7e-33 


123.0 


143 


Acyltransferase 


Acyltransferase 


2e-29 


111.2 


146 


cytochrome_c 


Cytochrome c 


1.7e-33 


124.7 


147 


pkinase 


Eukaryotic protein kinase domain 


2.3e-86 


300.3 


148


PDZ 


PDZ domain (Also known as DHR or 
GLGF).


1.7e-09 


45.0 


149 


aldo_ket_red 


Aldo/keto reductase family 


7.4e-189


640.8 


150 


homeobox 


Homeobox domain 


3.2e-08 


38.7 


151 


PseudoU_synth_1


tRNA pseudouridine synthase


4.7e-57 


203.0 


152 


abhydrolase 


alpha/beta hydrolase fold 


1.7e-31 


118.0 


153 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


1.1e-09


45.6 


156 


PHD 


PHD-finger 


7.6e-15 


62.8 


157 


fn3


Fibronectin type III domain


0.015 


21.9 


158 


homeobox 


Homeobox domain 


2.7e-27 


104.1


160 


PWI 


PWI domain 


3.9e-24 


93.6 


162 


DnaJ 


DnaJ domain 


2e-06 


34.8 


164 


Cbl_N 


CBL proto-oncogene N-terminal 
domain 


8e-117 


401.5 


166 


metalthio 


Metallothionein


3.1e-26 


100.6 


167 


LRR 


Leucine Rich Repeat 


0.00069 


26.3 


169 


fibrinogen_C 


Fibrinogen beta and gamma chains, 
C-term 


5.3e-180 


611.4 


170 


fibrinogen_C 


Fibrinogen beta and gamma chains, 
C-term 


5.3e-180 


611.4 


171 


fibrinogen_C 


Fibrinogen beta and gamma chains, 
C-term 


1e-149


510.8 


173 


homeobox 


Homeobox domain 


1.5e-29 


111.6


174 


FYVE 


FYVE zinc finger 


7.4e-28 


103.8 


1 K 


GRIP


GRIP domain




4U.D 


182 


pkinase 


Eukaryotic protein kinase domain 


3.4e-71 


250.0 


185 


CAP_GLY


CAP-Gly domain 


5.6e-51 


182.8 


186 


TBC 


TBC domain 


2.2e-50 


180.8 


187 


TBC 


TBC domain 


2.2e-50 


180.8 


188


PDZ


PDZ domain (Also known as DHR or
GLGF).






189 


Kelch 


Kelch motif 


5.2e-106 


365.6


190 


Tropomyosin


Tropomyosins


3.8e-171 




192 


Rieske 


Rieske [2Fe-2S] domain


0.0016 


1 Q.J 


199 


ig 


Immunoglobulin domain


5.9e-19 


I/O. 1 


202 


EGF 


EGF-like domain 


3.4e-54 


193.5 


203


trefoil


Trefoil (P-type) domain






204


TBC


TBC domain


R 5p-3R 

O.JC-jO 


1 J7.U 


205 


efhand


EFhand


v.\JU7U 


99 ^ 
ZZ.O 


206 


ISK_Channel


Slow voltage-gated potassium
channel


0 0031 

viUUj l 




207 


trefoil 


Trefoil (P-type) domain


9 Qp-4R 


173 7 


209


Ribosomal_S13


Ribosomal protein S13/S18


1 9p-7R 


974 7 


210 


hemopexin


Hemopexin




991 S 


213 


TBC


TBC domain 


2.5e-48 


174.0 


215




Myogenic Basic domain




1 TO C 






ftTOW mntif 




CO 0 


222 


fn3


Fibronectin type III domain 


7.3e-141 


481.4 


223




Cofilin/tropomyosin-type actin-
binding pr


y.jc-4/ 


1 /CO Q 

lOo.o 


224 


efhand 


EFhand 


6.1e-06 


33.2 


225


Pterin_4a


Pterin 4 alpha carbinolamine 
dehydratase 


y.je-4x 




228




ABC transporter 


/I 1 & 1 1 A 

4.ie-i iu 


379.2 


234


E1_DerP2_DerF2


E1 family


7 1q QO 


oiz.y 


235


E1_DerP2_DerF2


El family 


1 AO 

i.oe-4o 


174.0 


237


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


i. /e-z3 


no l 


238


Opioids_neuropep


Vertebrate endogenous opioids
neurope


i Co i eo 




239




Eukaryotic initiation factor 5A
hypusine


j.^e-itw- 


JJO.O 


240 


Amino_oxidase


Flavin containing amine oxidase


9 Sp-1 1 

X.JC-1 1 


37 R 
j f .a 


" 243 


zf-C2H2 


Zinc finger, C2H2 type


9 1p_00 


343 A 


244 


Band_7


SPFH domain / Band 7 family 


2.3e-53 


190.7 


245 


ank


Ank repeat


1 DO 

i.oe-oo 


"30T < 


246 




Zinc finger, C2H2 type


o. /e-4y 


17c fl 


247 


actin 


Actin 


2.3e-42 


140.3 


248


ER_lumen_recept


ER lumen protein retaining receptor 




coo c 


250 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family




1 /to 0 


252 


Collagen


Collagen triple helix repeat (20
copies)




ce /; 
jo.O 


255 


C2 


C2 domain 




7 R 
/ .0 


257 


CAP_GLY 


CAP-Gly domain 


1.4e-20 


81.8 


260 


WD40 


WD domain, G-beta repeat


Q Op fH 


*51 ft <. 


261 




WD domain, G-beta repeat


Q Qft /TO 


Zl o.D 


262 


WD40 


WD domain, G-beta repeat


0 Op £9 


9 1 R ^ 


263


cofilin_ADF


Cofilin/tropomyosin-type actin-
binding pr


/.&e-zi 


oZ.O 


264 


Ribosomal_L14


Ribosomal protein L14p/L23e


7.ZC-1U 


40 (\ 


265 


SAPA 


Saposin A-type domain


4.4e-27


103.4


266 


SAPA 


Saposin A-type domain 


4.4e-27 


103.4 


267 


ABC_tran


ABC transporter 


9.5e-39 


142.2 


269 


Ribosomal_L14


Ribosomal protein L14p/L23e 


6.2e-62 


219.2 


270 


abhydrolase 


alpha/beta hydrolase fold 


0.042


-3.3 


272 


ras 


Ras family 


4.3e-87 


302.8 


273 


rrm


RNA recognition motif. 


0.074 


14.6 


275


lipocalin


Lipocalin / cytosolic fatty-acid
binding pr


O <a A 1 

z.De-4i 


140.4 


, Z / O 


lad 


JLXOO ICLLlJll jr 


1 1p-67 
x . l e-o / 


9^R ^ 


277


UCH


Ubiquitin carboxyl-terminal
hydrolase, famil


1 9 P iji7 
1 .ze- 1 *r / 


^n^ 0 

JW.7 


278 


START 


START domain




44 1 


279 


WD40 


WD domain, G-beta repeat


1 Rp-97 


104 7 


282 


G-patch


G-patch domain


7 Rp-99 


OU.V 


287 


Anti_proliferat


BTG1 family


1 9p-101 


JJl .Lf 


289 


KRAB 




7 1ft-91 


R9 R 


9Q3 


7tm_3


/ 11 Ctll3lll CI 11 Ui CLUC J CUCUIUI 


J JC" / J 




— 7 J 


SET


SET domain


Sp-^0 


11^9 


296 


Pyridox_oxidase


Pyridoxamine 5'-phosphate oxidase


1.3e-76 


268.0 


297


rrm


RNA recognition motif.




i (V) 0 


298


Ubie_methyltran


ubiE/COQ5 methyltransferase family




■70.J 


299 


Ubie_methyltran 


ubiE/COQ5 methyltransferase family 


0.0024 


-118.1 


301




FAD/NAD-binding Cytochrome
reductase


7 7p /?1 


Zl J.J 


302 


G-patch 


G-patch domain 


3.1e-14 


60.7 


3m 


7tm_1


7 transmembrane receptor (rhodopsin
family)


/. /e-4J 


1 1 c 0 


JKfO 


rn 


Irxl uomain 




1 /.O 


310 


7tm_1


7 transmembrane receptor (rhodopsin
family)


i e-o*f 


97fl ft 


311


Rhodanese


Rhodanese-like domain




996 7 


319 


tubulin 


Tubulin/FtsZ family






314 


SURF4


SURF4 family




676 6 


325 


IMS 


impB/mucB/samB family




9f)7 S 
zu /. J 


397 


ifduiici hi 


v^auiiciui uuiiiaJLil 




6 n 

J J D.U 


39Q 


"MAC 


MAP Hrimnin 


9 1p-9R 


lf)7 R 


330 


IP_trans


Phosphatidylinositol transfer protein


fk Sp-08 

U.JC"70 


T3R 7 


332 


TFIIS 


JL 1 alio 1^1 1|J11U11 lCLCLUI O 11^117110^ 


O.OC-UJ 


9Q 3 


337 




x-iull, imgcij \^z.3Tljl type 


J .OC-D 1 


916 6 


340 

•7*TV 


AIRS 


ATT? cvntVinQp rpln+pH nrntpin 

AU\ ojrlllllaoC IClalCU piULCUJ 




1 90 9 


343 


annexin


Annexin


4 fip-RO 


97Q 4 


346 


OLaUUllUl 


OlaUlllllil LalLXliy 


1 .OC"7U 


314 ft 


347 


Ribosomal_L16


Ribosomal protein L16


4 fip-OQ 


34 Q 


348 


lactamase_B


Metallo-beta-lactamase superfamily 


0.012 


-6.0 


151 
-> j j. 


efhand


EFhand


9 ^p-14 


61 n 

0 X .u 


3 S3 
j j j 




Lectin C-type domain


i ^A„ns 


39 1 


354 


WD40 


WD domain, G-beta repeat 


2.2e-18 


74.5 


360 


lipocalin


Lipocalin / cytosolic fatty-acid
binding pr


o.je-iu 


*2ft 1. 
JO.J 


362 


Acetyltransf


Acetyltransferase (GNAT) family


u.ui/ xy 


94 0 




tRNA-synt_1


ijvin/\ oynuiciaseb class i ^i, xvi ana 




69 R 9 


366 


Sulfatase


Sulfatase


^ 1 P-99R 


77ft 6 


368 


START 


START domain


J.OC"l 1 


SO S 


369 


pkinase


Eukaryotic protein kinase domain


9 4p-10 


41 3 


370 


ACBP 


Acyl CoA binding protein


4 4p-*ifi 


1 GQ 7 


371 


pkinase


Eukaryotic protein kinase domain




397 S 

r .J 


373 


EGF 


EGF-like domain


9 fip-19 


<54 3 

J*T. J 


375 




Zinc finger, C2H2 type


R 9p-^4 


99S 4 

ZZ J.*T 


377 


KRAB 


KRAB box 


3.7e-27 


103.7 


379 


SET 


SET domain 


7.3e-61


215.6 


380 


Glyco_transf_8


Glycosyl transferase family 8


0.0028 


-40.1


381 


zf-C2H2 


Zinc finger, C2H2 type 


4.3e-06 


33.7 


383 


Glyco_transf_8 


Glycosyl transferase family 8 


0.0028 


-40.1 


384 


RasGEF 


RasGEF domain 


8.1e-43 


155.7 


385 


TBC 


TBC domain 


0.017 


-66.6 


389 


Glycos_transf_2


Glycosyl transferases 


1.3e-15 


65.3 


390 


Na_Ca_Ex


Sodium/calcium exchanger protein 


3.9e-105 


362.7 


391 


fn3


Fibronectin type III domain 


4.1e-102 


352.6 


392 


fn3


Fibronectin type III domain


3.4e-45 


163.6 


393 


fn3


Fibronectin type III domain 


3.4e-45 


163.6 


394 


ldl_recept_b


Low-density lipoprotein receptor
repeat 


7.1e-49


175.8 


395 


Ribosomal_L30


Ribosomal protein L30p/L7e 


0.0023 


16.0 


396 


Oxysterol_BP


Oxysterol-binding protein 


1.5e-94 


327.5 


397 


RDS_ROM1


Peripherin/rom-1 


2.9e-33 


123.9 


399 


lactamase_B


Metallo-beta-lactamase superfamily


3.4e-39 


143.6


402 


F-box 


F-box domain 


0.0002 


9R 1 


403 


CLP_protease 


Clp protease


4.8e-64 




405 


Ribosomal_L35Ae


Ribosomal protein L35Ae


6e-77 




406 


LIM 


LIM domain containing proteins


0.00021


20.7 


410 


tRNA-synt_1c


tRNA synthetases class I (E and Q) 


1e-236


799.8 


411 


NTP_transf_2


Nucleotidyltransferase domain 


3.9e-16 


67.0 


412 


DEAD 


DEAD/DEAH box helicase 


0.00016


17.2 


414 


DUF94 


Domain of unknown function DUF94


0.00011


26.9 


415 


tubulin 


Tubulin/FtsZ family 


4.5e-289


973.7 


420 


SET 


SET domain 


3.3e-57 


203.5 


421 


WD40 


WD domain, G-beta repeat 


6.1e-29 


109.6 


423 


zf-C2H2 


Zinc finger, C2H2 type


1.5e-39 


144 Q 


424 


pkinase 


Eukaryotic protein kinase domain


f! Qe-75 


9^1 R 


428 


LIM 


LIM domain containing proteins


1.8e-34 


19fi 7 


431 


kazal 


Kazal-type serine protease inhibitor
domain


J . ( C" 1 o 


7^ R 

/ J.O 


432 


SH2 


Src homology domain 2 


1.4e-67


198.4 


433 


zf-C2H2 


Zinc finger, C2H2 type 


2.8e-144 


492.7 


434 


ras 


Ras family 


0.012 


-106.8 


436 


E1-E2 ATPase 


E1-E2 ATPase 


1.6e-117 


391.0 


437 


RNA_pol_A 


RNA polymerase alpha subunit 


0 


1077.7 


438 


PHD 


PHD-finger 


1.6e-11


51.7 


439 


lectin_c


Lectin C-type domain


4.7e-30 


i lj.j 


440 


zf-C2H2 


Zinc finger, C2H2 type


1.1e-65


231.6 


441 


arrestin


Arrestin (or S-antigen)




OJO.l 


442 


aminotran_3


Aminotransferases class-III
pyridoxal-pho


O .XC'OU 


931 1 


443 


UCH-1 


Ubiquitin carboxyl-terminal
hydrolases famil


8.5e-12 


S7 6 


444 


CTF_NF1


CTF/NF-I family 


2.6e-277 


934.6 


451 


T-box 


T-box 


3.8e-117 


402.6 


453 


Rieske 


Rieske [2Fe-2S] domain 


2.6e-13 


57.7 


454 


zf-C2H2 


Zinc finger, C2H2 type 


3.9e-64 


226.5 


456 


homeobox 


Homeobox domain 


2.8e-08 


38.9 


459 


ig


Immunoglobulin domain 


2.6e-20 


70.5 


460 


Hydrolase 


haloacid dehalogenase-like hydrolase 


4e-25 


96.9 


462 


rve 


Integrase core domain 


1.6e-13 


50.7


466 


CH 


Calponin homology (CH) domain


2.4e-17 


71.1 


467 


CH 


Calponin homology (CH) domain


2.4e-17 


71.1


468 


Sterol_desat


Sterol desaturase 


7.5e-38 


139.2 


469 


pro_isomerase 


Cyclophilin type peptidyl-prolyl cis- 
tr


2.6e-63 


220.9 


470 


Peptidase_M24


metallopeptidase family M24 


6e-08 


28.1 


471 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


5.4e-129 


441.9 


472


Myb_DNA-binding


Myb-like DNA-binding domain


j.oe-uo 


71 0 


473 


zz 


Zinc fincer nrespnt in dvorronhin C*A 

A^lllW llXl^Vl pi&jCllL 111 UjroLl UL/lllil^ 




70 0 


474 


EF1G_domain


Elongation factor 1 gamma,
conserved doma




^OS s 


475


Ribosomal_L31e


Ribosomal protein L31e


ft 1p-^fi 

D. 1 C"UO 


7^7 ^ 


476 


C1q


C1q domain


9 ^p-7S 


?fi^ 7 

^£OJ. / 


477 


SH3 


SH3 domain


1 1p-19 


JJ.O 


478 


MoaA_NifB_PqqE


moaA / nifB / pqqE family


0.007


-17.7


479 


FYVE


FYVE zinc finger


9.3e-21 


7R 6 


480 


DNA_pol_A


DNA polymerase family A


2.3e-46 


167.4


482 


adh_short


short chain dehydrogenase 


1.2e-62


771 6 


483 


ank 


Ank repeat




71 9 


484 


IMS


impB/mucB/samB family




7Q0 5 


486 


TIR 


TIR domain


3.7e-19


67 R 


487 


FMO-like 


Flavin-binding monooxygenase-like


0


1475 S 


488 


I_LWEQ


I/LWEQ domain


9.5e-101


^41 0 


495 


homeobox


Homeobox domain 


3.6e-06 


30.8 


497 


pkinase


Eukaryotic protein kinase domain


x.JC-lOO 


300. 1 


499


fn3


Fibronectin type III domain




R01 R 


501 


LRR


Leucine Rich Repeat




11^6 
1 13.0 


502 


RGS 


Regulator of G protein signaling
domain


0.041 


11.9 


503 


filament 


Intermediate filament proteins 


1e-142


487.5 


505




Fibronectin type III domain


i ^p. i on 


/ 


506 




HECT-domain (ubiquitin-
transferase).


1 p 1 "5 


O 


507 


Ribosomal_L7Ae


Ribosomal protein L7Ae


< 7^7/; 

D. /c-^O 


OQ 7 

77. / 


508 


WD40 


WD domain, G-beta repeat


0 063 


19 R 

I7.0 


509 


WD40 


WD domain, G-beta repeat


0.063 


19 R 

1 7.0 


510 


WD40 


WD domain, G-beta repeat 


2.1e-42 


154.3 


511 


pkinase


Eukaryotic protein kinase domain




^00 A 


512 


G-gamma


GGL domain


1 Qp-OR 




513 


SH3 


SH3 domain




7 


515 


HTH_AraC


Bacterial regulatory helix-turn-helix
protei


0 .7C-Z / 


1 0 Vfi 


516 


zf-C2H2 


Zinc finger, C2H2 type


1 7p-^4 


17R 0 


517 


S1


S1 RNA binding domain


\J. 1 C'JO 


705 Q 


518 


pkinase


Eukaryotic protein kinase domain


1 Re-75 


764 7 


525 


cadherin 


Cadherin domain 


?p-R0 


7R0 6 


528 


zf-C2H2 


Zinc finger, C2H2 type


4e-70 


746 4 


529 


neur_chan


Neurotransmitter-gated ion-channel


5.8e-222 


750 R 


531 


RhoGEF 


RhoGEF domain 


3.5e-44


160 7 


532 


myosin head 


Myosin head (motor domain)


0 


1494 5 


533 


LRR 


Leucine Rich Repeat


R ^e-15 

O.JC" 1 J 


6? 6 


535 


Sec7 


Sec7 domain 




^19 1 


536 


homeobox 


Homeobox domain 


4.8e-05 


26.4 


539 


actin 


Actin


7 J.p-1 00 


^^0 f\ 


542 


ank 


Ank repeat 


1.9e-35 


131.2 


544 




Zinc finger C-x8-C-x5-C-x3-H type


7 Co 1 0 


A\ 7 


546 




Dual specificity phosphatase,
catalytic doma


7 Ap_40 


147 A 


547 


HMG_CoA_synt 


Hydroxymethylglutaryl-coenzyme A 
synthas 


0 


1250.8 


549 


laminin G 


Laminin G domain 


3.3e-76 


266.6 


551 


PHD 


PHD-finger 


0.008 


9.3 


552 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


0.0017


25.0



WO 01/57190



PCT/US01/04098



SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE






555 


WW 


WW domain 


1.3e-24


95.3 


558 


kinesin 


Kinesin motor domain


1.8e-176 


599.7 


559 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.00085 


16.5 


563 


efhand 


EFhand 


7.9e-11


49.4 


567 


PH 


PH domain 


7.8e-06 


25.9


568 


PH 


PH domain 


3.1e-39 


143.8 


569 


Hist deacetyl 


Histone deacetylase family


5.2e-106 


365.6


570 


PDZ 


PDZ domain (Also known as DHR or
GLGF). 


3.4e-20 


80.5 


571 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1e-16


58.5 


573 


ubiquitin 


Ubiquitin family 


1.4e-08 


31.1 


574 


FH2 


Formin Homology 2 Domain


1.3e-110


380.9 


576 


serpin 


Serpins (serine protease inhibitors) 


4.3e-146 


496.4 


579 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-76 


265.8 


580 


pkinase 


Eukaryotic protein kinase domain 


6.9e-79 


275.5 


581 


RhoGAP 


RhoGAP domain


4.4e-53 


189.8 


582 


Ribosomal_L7Ae


Ribosomal protein L7Ae


0.098




584 


kazal 


Kazal-type serine protease inhibitor
domain 


2.2e-52 


187.4 


585 


LRR 


Leucine Rich Repeat 


4.4e-28 


106.7 


586 


PHD 


PHD-finger 


3.8e-12 


53.8 


588 


GTP1 OBG 


GTP1/OBG family






590 


Collagen 


Collagen triple helix repeat (20

copies) 




1 *i9 A 


591 


lys


C-type lysozyme/alpha-lactalbumin
family 


1.6e-31


116.4 


596 


ACBP 


Acyl CoA binding protein


0.0022 


-9.4 


597 


SNF2_N 


SNF2 and others N-terminal domain 


3.7e-98 


339.5


600 


KRAB 


KRAB box 


1.3e-29 


111.8 


606 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


607 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


608 


WD40 


WD domain, G-beta repeat 


5.3e-23 


89.8 


610 


cpn60 TCP1 


TCP-l/cpn60 chaperonin family 


1.7e-237 


802.4 


613 


THF_DHG_CYH


Tetrahydrofolate
dehydrogenase/cyclohydro


4.9e-173 


588.3 


617 


rrm


RNA recognition motif 


4e-14 


60.4 


618 


rrm


RNA recognition motif. 


4e-14 


60.4 


620 


cofilin ADF 


Cofilin/tropomyosin-type actin-
binding pr


3e-06 


^4 9 


621 


Nop


Putative snoRNA binding domain 


6.1e-95 


328.8 


622 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


5.8e-21 


83.1 


625 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-124 


426.4 


628 


DEAD 


DEAD/DEAH box helicase 


2.5e-68 


219.0 


632 


GST 


Glutathione S-transferases. 


4.8e-26 


89.0 


633 


5 nucleotidase 


5'-nucleotidase 


6.6e-248 


837.0 


636 


LIM 


LIM domain containing proteins


1.6e-88 


307.5 


637 


pkinase 


Eukaryotic protein kinase domain 


1.5e-73 


257.8 


638 


MSP_domain 


MSP (Major sperm protein) domain 


8.4e-09 


42.7 


639 


metalthio 


Metallothionein 


2e-24 


94.6 


641 


zf-C2H2 


Zinc finger, C2H2 type 


6.1e-114 


391.9 


642 


Ribosomal_S28e 


Ribosomal protein S28e 


9.3e-48 


172.1 


643 


Ribosomal_S5


Ribosomal protein S5 


8.3e-87 


301.8 


646 


PHD 


PHD-finger


0.00025 


23.1 


647 


WD40 


WD domain, G-beta repeat 


1.5e-22 


88.4


648 


Lipase_GDSL


Lipase/Acylhydrolase with GDSL-
like motif


U.UJ J 


9 9 


652 


zf-C2H2 


Zinc finger, C2H2 type 


4.1e-146 


498.8 


653 


histone


Core histone H2A/H2B/H3/H4


1 ?p-10 


4R 8 


654 


zf-C2H2


Zinc finger, C2H2 type


1 0p-R7 
1 .7C-0 / 




655 


ras 


Ras family


6 Ap-77 


9^0 n 


657 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


S ^p-1^ 

J .J V I J 




658 


STphosphatase 


Ser/Thr protein phosphatase 


2.6e-182 


619.1 


659 


zf-C2H2 


Zinc finger, C2H2 type


1 ^p-0? 


^91 1 
jz l.l 


660 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-85 


297.6 






"Wti/*1pnci/iP Hirtlmcnhiitf'P L^itiaopc 
n UVylOUoJUC U.ipiiUdpJid.LC KlilaoCb 


1 .'tc- 1 1 7 


41 n 7 




IRF 


Interferon regulatory factor
transcription f


7p 70 


70 K 


665 


4HPPD C 


4-hydroxyphenylpyruvate
dioxygenase C term




UO.J 


666 


DEAD 


DEAD/DEAH box helicase 


4.8e-74 


237.1 


667 


DEAD 


DEAD/DEAH box helicase 


2.9e-70 


29S 1 


669 


pkinase


Eukaryotic protein kinase domain


u. I Or J 


^99 9 


671 


homeobox


Homeobox domain


U.U i 0 


iU.J 


678 


crystall 


Beta/Gamma crystallin 


4.7e-106 


365.8 


679 


WD40 


WD domain, G-beta repeat


1 ,7C"UO 


^4 0 


680 


Keratin_B2


Keratin, high sulfur B2 protein




1 j.y 


682 


G-gamma 


GGL domain 


8.5e-33 


117.9 


UOJ 




Ubiquitin carboxyl-terminal
hydrolase family




ill./ 


686 


Acetyltransf


Acetyltransferase (GNAT) family


■ O.Oc-IU 


4/\ 4 


687 


7tm_1


7 transmembrane receptor (rhodopsin
family)






688 




Proteasome A-type and B-type


U.JC'Ut 


99 S 7 


689 


SCP2 


SCP-2 sterol transfer family


6 9e-37 


1^6 1 

1JU. 1 


690 


TS-N 


TS-N domain 


0.041 


20.1 


692 


zf-C2H2 


Zinc finger, C2H2 type


0 op-fin 


91 1 Q 


693 


zf-MYND


MYND finger




j.j 


694 


Oxysterol_BP


Oxysterol-binding protein




** j j. / 


695 


PDZ 


PDZ domain (Also known as DHR or 


1.3e-30 


115.1 


703 


Peptidase_C2


Calpain family cysteine protease


9 ^p-17^ 


JJ'D.U 


706 


filament


Intermediate filament proteins


7 9p-107 


jOo.j 


710 


fibrinogen_C


Fibrinogen beta and gamma chains,
C-term


/c-ov 


97S n 


711 


SH2 


Src homology domain 2 


2.3e-65 


192.1 


712 


ATP-synt_DE


ATP synthase, Delta/Epsilon chain


fi nnn/i9 


ion 


713 


ARID 


ARID/BRIGHT DNA binding domain


9p-1 7 


711 


714 


LBP BPI CETP 


LBP / BPI / CETP family


O.OO" J*T 


19^ 7 


715 


RNA_pol_L


RNA polymerases L / 13 to 16 kDa
subunit


H.OC-Hy 


1 /O.J 


716 


KRAB 


KRAB box 


1.3e-42 


155.0 


717


mito carr 


Mitochondrial carrier proteins 


4.8e-38 


1 j j. j 


719 


Gal-bind lectin 


Vertebrate galactoside-binding lectin 


1.5e-25 


90.2 


726 


aldedh 


Aldehyde dehydrogenase family




41 0 R 


728 


Glycos_transf_2 


Glycosyl transferases 


4e-21


83.6 


734 


ELM2


ELM2 domain


ze-j4 


1j£ /.o 


735 


PR55


Protein phosphatase 2A regulatory
subunit PR


A 

u 


IUjo.Z 


737 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4e-14 


60.4 


740 


WD40 


WD domain, G-beta repeat 


5.6e-14


59.9 


745 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


3.8e-13


46.9






749 


mito carr 


Mitochondrial carrier proteins 


4.5e-67 


232.8 


750 


DUF27 


Domain of unknown function DUF27 


4.5e-12 


53.5 


751 


SH3 


SH3 domain 


3.6e-17 


70.5 


752 


HMG box 


HMG (high mobility group) box 


8.6e-13 


55.9 


753 


SPRY 


SPRY domain 


5.9e-05 


23.3 


754 


GTP_CDC


Cell division protein 


7.5e-153 


521.2 


755 


mito carr 


Mitochondrial carrier proteins 


3e-88 


305.4 


756 


TSPN 


Thrombospondin N-terminal -like
domains 


8.1e-58 


205.5 


757 


BTB 


BTB/POZ domain 


5.7e-23 


89.7 


759 


zf-C2H2


Zinc finger, C2H2 type 


1.2e-12 


55.4 


760 


NSF 


NSF attachment protein 


6.4e-127 


435.1 


762 


Ribosomal S14 


Ribosomal protein S14p/S29e 


2.1e-06 


24.8 


765 


ThiF_family


ThiF family 


1.7e-39 


144.6 


766 


DnaJ 


DnaJ domain 


3.9e-36 


133.5 


768 


tRNA-synt_2b 


tRNA synthetase class II 


9.1e-81 


281.7 


769 


ldl_recept a 


Low-density lipoprotein receptor 
domain 


0 


1404.5 


770 


WD40 


WD domain, G-beta repeat 


2e-21 


84.6 


771 


LRR 


Leucine Rich Repeat 


3.8e-06 


33.9 


774 


SNF2 N 


SNF2 and others N-terminal domain






776 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


777


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


778 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4


779 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger) 


J . 1 C- v O 


31 n 


781 


cadherin 


Cadherin domain 


5.6e-113


388.7 


783 


HECT 


HECT-domain (ubiquitin- 
transferase). 


4.2e-31


116.8 


785 


sushi 


Sushi domain (SCR repeat) 


1.8e-60 


214.3 


786 


sushi 


Sushi domain (SCR repeat) 


1.8e-60 


214.3 


788 


vwa 


von Willebrand factor type A domain 


1.9e-52 


187.7 


790 


rrm 


RNA recognition motif. 


2.8e-20 


80.8 


791 


Collagen 


Collagen triple helix repeat (20 
copies) 


0.00097


Q 7 


792 


pkinase 


Eukaryotic protein kinase domain 


0.023 


12.4 


795 


zf-C2H2 


Zinc finger, C2H2 type 


6.5e-95 


328.7 


796 


adh short 


short chain dehydrogenase 


4.1e-05 


-7.3 


799 


SAICAR_synt 


SAICAR synthetase 


6e-125 


428.5 


805 


WD40 


WD domain, G-beta repeat 


4e-65 


229.8


806 


ZU5 


ZU5 domain 


4.7e-37 


136.5 


807 


WD40 


WD domain, G-beta repeat 


0.016 


21.8 


808 


WD40 


WD domain, G-beta repeat 


0.0041 


23.8 


809 


pkinase


Eukaryotic protein kinase domain 


2e-31 


1172 


810 


vwa 


von Willebrand factor type A domain 


1.9e-52 


187.7 


814 


zf-C2H2 


Zinc finger, C2H2 type 


4.5e-83 


289.4 


815 


zf-C2H2 


Zinc finger, C2H2 type 


6e-74 


259.1 


817 


myosin head 


Myosin head (motor domain) 


1.5e-176 




818 


GSPII_E 


Bacterial type II secretion system 


0.012 


1 1 S 
1 1 .J 






protein 






819 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase 


1.1e-74


215.5 


821 


PH 


PH domain 


0.00025 


20.5 


822 


CNH 


CNH domain 


0.00015 


-24.7 


827 


rrm 


RNA recognition motif. 


1.5e-06 


352


829 


HMG box 


HMG (high mobility group) box 


7.Se-34 


325.8 


830 


RasGEF


RasGEF domain 


2.2e-102 


353.5 


831 


CNH 


CNH domain 


3e-118


406.2 


832 


mito carr 


Mitochondrial carrier proteins 


3.7e-37 


130.3 


833 


PX 


PX domain 


2.7e-19 


77.5 


837 


Y_phosphatase 


Protein-tyrosine phosphatase 


1.6e-263 


888.8 


838 


ank 


Ank repeat




°l 1 ^ 


840 


ank 


Ank repeat


5.8e-38 


uy.xj 


842 


Ribosomal_L15e 


Ribosomal L15


4.8e-131 


448.8 


843 


SNF 


Sodium:neurotransmitter symporter
family




i9m si 


845 


Peptidase_M16


Insulinase (Peptidase family M16)


4.7e-67 


916 7 


848 


EF1BD 


EF-1 guanine nucleotide exchange
domain 


2.2e-56 


700 7 


849 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-122 


420.5 


850 


zf-C2H2 


Zinc finger, C2H2 type


?e-67 


717 4 

ZJ / .f 


852 


SIS 


SIS domain


_J .OC'JU 


1 10. 0 


853 


RhoGAP 


RhoGAP domain 




X JO.O 


854 


PDZ 


PDZ domain (Also known as DHR or
GLGF).




4r^ 7 


856 


ACOX 




0 1p-9^ 


1 


858 


efhand 


EF hand




74 A 


860 


homeobox 


Homeobox domain 


4e-22 


86.9 


862 


TFIIF_beta


Transcription initiation factor IIF,
beta






866 


A2M 


Alpha-2-macroglobulin family




70 0 


867 


MoCF_biosynth 


Molybdenum cofactor biosynthesis 
protei


5.8e-205 


694.3


868 


EGF 


EGF-like domain 


4.1e-22 


86.9 


869 




EGF-like domain


1 1- 00 


BO 0 
56.0 


871


PT-PT COf 


Phosphatidylinositol-specific
phospholipase


7 7*> 0^ 


IOC A 


872 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1 1p-7ft 
1 . 1 c-zv 


R9 1 


874 


SH3 


SH3 domain 


2.2e-14 


61.2 


877 


SH3 


SH3 domain


R fip-Qfl 

O.OC*7U 


1117 


882 


KRAB 


KRAB box 




167 A 


885 


ank 


Ank repeat 


7.1e-07 


36.3 


886 


biopterin_H


Biopterin-dependent aromatic amino
acid h


n 




887 


GTP EFTU 


Elongation factor Tu family


A Qp_190 


J.17 ^ 


888 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


1 .DC" J H 


^1 4 


889 


zf-C2H2 


Zinc finger, C2H2 type


3.7e-92 


11 Q 6 


890 


ig


Immunoglobulin domain 


3.8e-06 


24.8 


892 


PTR2 


POT family






893 


Sulfatase 


Sulfatase 




77^ 9 
z / J.Z 


894 


Sulfatase 


Sulfatase 


3.5e-78 


273.2 


895 


7tm 1 


7 transmembrane receptor (rhodopsin
family)






896 


Giver* hvdrn 1 1 


filvf^ocvl hvHmlacpc fciTriilv 11 
\j iy\s\Joy 1 iij'unjlasco xailiUy Jl 


n 

u 


1 977 "3 


897 


chromo


MOdifierS 


■3 Op 
J.7C-UU 


96 n 


898 


Cbl N 


domain 


1 9p_97^ 

1 Zc-i / J 


099 A 


899 


vwa 


von Willebrand factor type A domain 


5.5e-32 


119.7 


900 


WD40 


WD domain, G-beta repeat 


2.7e-07 


37.7 


901 


zf-C2H2 


Zinc finger, C2H2 type 


4e-156 


532.1 


903 


ras 


Ras family 


6.6e-101 


348.6


904


Armadillo_seg


Armadillo/beta-catenin-like repeats


1.1 C \JD 


JJ.O 


906 


FH2 


Formin Homology 2 Domain


4.5e-112


^RS 7 

JOJ. / 


907 


Cytidylyltransf


Cytidylyltransferase


1.4e-05


90 ^ 


908 


pkinase 


Eukaryotic protein kinase domain


1 ?p-64 


99R 9 


909 


pkinase 


Eukaryotic protein kinase domain


R Se-7ft 

O.JC / V 


94S ^ 


910 


pkinase 


Eukaryotic protein kinase domain


9 Qp-4? 


1 R 

J. Jj.O 


911 


pkinase 


Eukaryotic protein kinase domain


1 ?P-^S 


1 ^1 R 


912 


PHD 


PHD-finger






913 


PHD 


PHD-finger


J.JC~1U 


OU.J 


916 


filament 


Intermediate filament proteins


Q7e-l?l 




917 


LIM


LIM domain containing proteins


S Oe-1 S 

J.7C" 


S7 0 


918 


SAM 


SAM domain (Sterile alpha motif) 


4.3e-16 


66.9 


922 


Acylphosphatase


Acylphosphatase


9 Qp-fi ^ 


993 (\ 


924 


ig


Immunoglobulin domain






925 


Acyl-CoA_dh




z.*fe-i ji 


AAQ C 


927 


7tm_l 


7 transmembrane receptor (rhodopsin
family)


2.9e-45 


145.9 


928 


globin


Globin




1 50.7 


929


sugar_tr


Sugar (and other) transporter




Oo.o 


932 


Collagen


Collagen triple helix repeat (20
copies)


ft nnno7 


0 7 


933 


HMG box 


HMG (high mobility group) box


7 Rp-^4 


19^ R 


934 


SEA 


SEA domain 


ft ftft9l 


94 7 

Z*T. / 


935 


ras 


Ras family


O.HC"J7 . 


9H0 9 


936 


CH 


Calponin homology domain




R^ 7 


937 


voltage_CLC


Voltage gated chloride channels




(\lf* ft 


938 


homeobox 


Homeobox domain




QR ft 
70.U 


940 


pkinase 


Eukaryotic protein kinase domain


7.7C-JO 


90^ 9 


942 


Myosin_tail


Myosin tail


3 7p-ftQ 


"^R 9 


943 


zf-C2H2 


Zinc finger, C2H2 type


9 9p-Q9 


^9ft ^ 


945 


Clat_adaptor_s


Clathrin adaptor complex small chain




9^<l ft 


946 


sugar tr 


Sugar (and other) transporter


ft ftl7 


-1 99 R 


947 


tRNA-synt_le 


tRNA synthetases class I (C) 


0.00097 


15.6 


948 


PHD 


PHD-finger


9 9p-1 7 


71 9 


951 


sugar_tr


Sugar (and other) transporter


ft ftftR9 


1 T3 0 


952 


mito carr 


Mitochondrial carrier proteins


1 7p-^A 


1 QO 7 


953 


myb_DNA-
binding


Myb-like DNA-binding domain


A ^p„"7ft 


fin i - 


955 


ketoacyl-synt


Beta-ketoacyl synthase






957 


aldo ket red 


Aldo/keto reductase family


l .jc-70 


14ft R 


959 


Kelch 


Kelch motif 


0.02 


20.8 


961 


ras


Ras family


7 9p "50 


1111 
111.1 


964 


homeobox


Homeobox domain




oO.J 


965 


PH 


PH domain 




Rft 0 
0U.7 


966 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


9 9p_ft0 


14 7 


967 


Ribosomal L29 


Ribosomal nrntein 


l .UC" I J 


fiS ft 


970 


FAD_binding_2 


FAD binding domain 


8.9e-47 


166.6 


971 


rve 


Integrase core domain


ft ftftfti ^ 


IO R 


972 


Glycos_transf_2


Glycosyl transferases


9 1p-91 


R4 ^ 

On.J 


974 


Ribosomal L10 


Ribosomal protein L10


^ ^p-4R 


1 71 6 
1 r j.D 


975 


7tm_1


7 transmembrane receptor (rhodopsin
family)


1 fip-17 
1 .Oc-o / 




976 


zf-C4 


Zinc finger, C4 type (two domains) 


2.1e-52 


178.5 


977 


zf-C2H2 


Zinc finger, C2H2 type 


6.6e-150 


511.4 


978 


FTHFS 


Formate-tetrahydrofolate ligase 


0 


1367.2 


982 


Renal_dipeptase 


Renal dipeptidase 


1.3e-73 


258.0 


984 


A_deaminase


Adenosine/AMP deaminase 


2.6e-05 


-48.6


TABLE 5



SEQ ID NO:
of full-length
nucleotide
sequence


SEQ ID
NO: of
full-length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID NO:
of contig
peptide
sequence


Priority docket
number_corresponding
SEQ ID NO: in
priority application


SEQ ID NO: in
U.S.S.N. 09/496,914


i 
l 




jyoy 


T0^3 

zyD^ 


/5/ULrZ 1 


1 ? A 


z 


yoo 


1 Q7A 

iy f\) 


zyj4 


7o7CLfz 2 


ZZ3 


o 


QR7 


1 071 

iy / 1 


zyjj 


/o/Uli^Z 3 


1 oo4 


A 
4 


OCR 

yoo 


1 077 

iy /z 


oner 

zyjo 


/o/UlrZ 4 


7 1 T3 

Z123 


< 


QRO 

yoy 


iy i j 


zy j / 


/o/UirZ o 


Z3 13 


0 


qoa 
yyu 


i on a 
iy /4 


zyr>o 


/o /UUrZ_o 


3Zo4 


7 
/ 


001 

yy i 


1 QK 

iy / j 


zyjy 


/o/dr'Z / 


33Z4 


R 
o 


QG7 

yyz 


1 Q7£ 

iy /o 


OOAA 

zyou 


/o/Urz o 


oloz 


0 

y 


003 

yyj 


1 077 

iy / / 


zyoi 


/o/CIrZ y 


£7 1 A 
OZIU 


i a 
J u 


yy4 


1 Q7Q 


zyoz 


7070Tt51 1 A 


6213 


1 1 
1 1 


yyj 


i 0*70 

iy /y 


zyoi 


/6/Urz 11 


6257 


i 7 

1Z 


OQ£ 

yyo 


1 OCA 

iyou 


zyo4 


/o/Ulrz 12 


6294 


ii 


QQ7 

yy / 


1 OR 1 

iyoi 


zyoj 


7o/ClrZ 13 


6294 


14 


ooc 

yy© 


1 OCT . 


zyoo 


7o7L,Ur2 14 


6330 


i_> 


QOO 

yyy 


iyoi 


2967 


7o7Clr2_15 


6364 


lo 


i aaa 


1 GQA 

iys4 


2968 


787CIP2_16 


6455 


1 7 


1 AA1 


i no c 

lyoD 


29oy 


7o7CIP2_17 


6486 


16 


i AA7 
1UUZ 




zyvo 


787ULP2 18 


6503 


iy 




1 00*7 

lyo/ 


2971 


787C1F2 19 


6528 


ZU 


1UU4 


lyoo 


OOT> ■ 

2972 


787C1P2_20 


6572 


7 1 
Zl 


i aac. 


1989 


OOT3 

2973 


787C1F2_21 


6578 


zz 


1 AA< 

lUUO 


1 OOA 

iyyu 


zy /4 


787Clr2 22 


6593 


Zi 


i AA7 
luu / 


1 OQ1 

iyy l 


Z9/5 


787LLP2 23 


*;/:n'3 

6603 


7/1 

Z4 


l aao 


lyyz 


2976 


787C1P2 24 


6603 


7^ 
Z3 


1 AAO 

iuuy 


1 OQ7 

iyy j 


2977 


787CJLP2 25 


6679 


zo 


ini n 
iUlU 


1 OQA 

iyy4 


0070 

Z978 


787CLP2 26 


6744 


z / 


1A11 

1U 1 i 


1 OGC. 

iyyj 


0070 

Z9 /9 


7o7ClrZ 27 


6762 


Zo 


inn 
JUiZ 


1 DQ/C 

lyyo 


2980 


7o7ClP2 28 


6770 


7G 
zy 


1 Al 1 

1U13 


1 OOT ■ 

iyy / 


Z9ol 


7b7CIP2_29 


6770 




1U14 


iyy© 


Z9oZ 


7o7CIP2 30 


6787 


D 1 


1 A1 < ■ 

1U1 j 


1 ooo 

iyyy 


Z9oi 


787C1F2 31 


6858 


*37 
iZ 


i ai a 


7AAA 

zuuu 


zyo4 


787CLP2 32 


6866 


JJ . 


1 AIT 
1U1 / 


• 7AA1 


zyoD 


787CLrz 33 


6938 


1A 

JH 


1 A1 R 

lulo 


7AA7 
ZUUZ 


Z9oo 


787Clrz 34 


6938 




1 AI O 

luiy 


7AA2 
ZUUi 


zyo/ 


757C1P2 35 


6977 


JO 


1 AO A 


7 AAA 
ZUU4 


zyoo 


/o/^UrZ 30 


7AA1 
700 1 


37 


1071 
luZ 1 


700^ 


7QC0 

zyoy 


707f" , T"D7 "27 


7AA7 


3R 

JO ■ 


1077 
iUZZ 


7006 


70QA 

zyyu 


/o/L/lrZ 3o 


7AA/1 




1073 
IUZj 


7007 
ZUU / 


7001 

zyyi 


/o/dJrZ ^y 


7AAC. 


40 


1074 


700R 
ZUUO 


7Q07 

zyyz 


75701157 ytA 

/o/drZ 4U 


7AA#C 
/UUO 


41 


107S 


7000 

zvuy 


7001 

zyyj 


/o/UlPZ 41 


7AAO 


HZ 


107fi 


701 0 


zyy4 


/o/ULrZ 4Z 


7A1 yl 

/U14 


43 


1077 
iUZ / 


701 1 
Zu 1 1 


ooo< 
zyyj 


/O/UJLrZ 43 


7A7 1 


44 


1 A7R 
1UZO 


7017 
ZvlZ 


zyyo 


707OT"D7 A A 

/o/UUrZ 44 


/022 


*+-> 


1 A7Q 

juzy 


701 3 


O0O7 

zyy / 


/o/UlrZ_4o 


7057 


46 


1030 


2014 


2998


7R7PTP7 47 


/ UJ o 


47 


1031 


2015 


2999 


787CIP2 49 


7088 


48 


1032 


2016 


3000 


787CIP2_50 


70S9 


49 


1033 


2017 


3001 


787CIP2 51 


7182 


50 


1034 


2018


3002 


787CIP2 52 


7489 


51 


1035 


2019 


3003 


787CIP2 53 


7564 


52 


1036 


2020 


3004 


787CIP2 54 


7566 


53 


1037 


2021 


3005 


787C1P2 55 


7587


54


1038 


2022 


3006 


787CIP2 56 



7S91 


55 


1039 


2023 


3007


787C1P2 57 



7600 


56 


1040 


2024 


3008 


787C1P2 58 


7604 


57 


1041


2025 


3009 


787CIP2 59 



7612 


58 


1042 


2026 


3010 


787CIP2 60 


7613 


59


1043 


2027 


3011 


787CIP2 61 


7615 


60 


1044 


2028 


3012 


787CIP2 62 



7616 


61 


1045 


2029 


3013 


787CIP2 63 


7617 


62 


1046 


2030 


3014 


787CIP2 64 


7623 



63 


1047 


2031 


3015 


787CDP2 65 


762 S 


64 


1048 


2032 


3016 


787CIP2 66 



7fi2S 


65 


1049 


2033 


3017 


787CEP2 67 


7630 


66 


1050 


2034 


3018 


787CIP2 68 



7638 


67 


1051 


2035 


3019 


787CIP2 69 



7640 


68 


1052 


2036 


3020 


787CIP2 70 



7670 


69 


1053 


2037 


3021 


7S7CIP2 71 



7676 


70 


1054 


2038 


3022 


787CIP2 72 


7688 


71 


1055 


2039 


3023 


787CIP2 73 



7690 


72 


1056 


2040 


3024 


787CIP2 74 


7700 


73 


1057 


2041 


3025 


787CIP2 75 



7774 


74 


1058 


2042 


3026 


787CIP2 76 


7784 


75 


1059 


2043 


3027 


787CIP2 77 



778S 


76 


1060 


2044 


3028 


787CTP2 78 



1 1 J At 


77 


1061 


2045 


3029 


787CTP9 70 



7798 


78 


1062 


2046 


3030 


787CTP9 80 



7807 


79 


1063 


2047 


3031 


787CTP2 81 



7810 


80 


1064 


2048 


3032 


787CTP2 82 



7812 



81 


1065 


2049 


3033 


787CTP9 83 



7816 


82 


1066 


2050 


3034 


787CIP9 84 



7896 


83 


1067 


2051 


3035 


787CTP9 85 



7842 


84 


1068 


2052 


3036 


787CIP2 86 



7850 


85 


1069 


2053 


3037 


787CTP7 87 



786S 


86 


1070 


2054 


3038 


787CTP2 88 



7887 



87 


1071 


2055 


3039 


787CTP2 89 



7891 


88 


1072 


2056 


3040 


787CIP2 90 



7892 


89 


1073 


2057 


3041 


787CIP2 91 



7896 


90 


1074 


2058 


3042 


787CTP2 92 



7896 


91


1075 


2059 


3043 


787CIP2 93 



7907


92 


1076 


2060


3044 


787C1P2 94 



791^ 


93 


1077 


2061 


3045 


787CTP2 95 



7914 


94 


1078 


2062 


3046 


787CTP9 9fi 



791 S 


95 


1079 


2063 


3047 


787CTP9 97 


7990 



96 


1080 


2064 


3048 


787CTP9 98 



7991 


97 


1081 


2065 


3049 


787CTP2 99 



7994 


98 


1082 


2066 


3050 


787CIP2 100 



i y£t i 


99 


1083 


2067


3051 


7S7CIP2 101 



7999 



100 


1084 


2068 


3052 


787CIP2 102 



7937 


101 


1085 


2069 


3053 


787CIP2 103 



7940 


102 


1086 


2070 


3054 


787C1P2 104 



7949 i 


103 


1087 


2071 


3055 


787CIP2 105 



7944 


104 


1088 


2072 


3056 


787CIP2 106 



79S1 


105 


1089 


2073 


3057 


787CIP2 107 


7951 


106 


1090 


2074 


3058 


787CIP2 108 


7962 


107 


1091 


2075 


3059 


787CTPO 109 



7964 


108 


1092 


2076 


3060 


787CTP2 110 


7977 


109 


1093 


2077 


3061 


787CJP2 111 


7978 


110 


1094 


2078 


3062 


787CIP2_112 


7980 


111 


1095 


2079 


3063 


787CIP2_1 13 


7982 


112


1096 


2080 


3064 


787C1P2 114 


8000 


113 


1097 


2081 


3065 


787CIP2 115 


8003


114


1098 


2082 


3066 


787CIP2 116 


8004 


115 


1099 


2083 


3067 


787CEP2 117 



8007 


116 


1100 


2084 


3068 


787CIP2 118 


8008 


117 


1101 


2085 


3069 


787CIP2 119 


8009 


118 


1102 


2086 


3070 


787CIP2 120 


8013 


119 


1103 


2087 


3071 


787C1P2 121 


8017 


120 


1104 


2088 


3072 


787CIP2 122 



8018 


121 


1105 


2089 


3073 


787CIP2 123 


8021 


122 


1106 


2090 


3074 


787CIP2 124 



8022 


123 


1107 


2091


3075 


787CIP2 1?5 



R023 


124 


1108 


2092 


3076 


787CIP2 126 


8023 


125 


1109 


2093 


3077 


787CIP2 127 



8024 


126 


1110 


2094 


3078 


787CIP2 198 


8026 


127 


1111 


2095 


3079 


787CDP2 129 



8028 


128 


1112 


2096 


3080 


787CTP2 130 



8036 


129 


1113 


2097 


3081


787CIP2 131 


8038 


130 


1114 


2098 


3082 


787CTP2 132 



804 S 


131 


1115 


2099 


3083 


787CIP2 133 


804 S 


132 


1116 


2100 


3084 


787CTP2 134 



804 R 


133 


1117 


2101 


3085


7870TP? 13*i 



804 R 


134 


1118 


2102 


3086 


787CTP? 136 


8052 


135 


1119 


2103 


3087 


787CTP9 137 



R0S3 


136 


1120 


2104 


3088 


787CTP2 138 



8055 


137 


1121 


2105 


3089 


787CTP9 139 


ROSO 


138 


1122 


2106 


3090 


7R7CTP? 140 


0\J\J 1 


139 


1123 


2107 


3091 


787CTP? 141 


R0fi9 



140 


1124 


2108 


3092 


787CTP2 142 


8063 


141 


1125 


2109 


3093 


7R7C TP9 1 43 



R064 


142 


1126 


2110 


3094 


787CTP9 144 


O UU J 


143 


1127 


2111 


3095 


787CIP2 145 



8068 


144 


1128 


2112 


3096 


787CTP? 146 


R06Q 



145 


1129 


2113 


3097 


787CTP9 1 47 



R070 


146 


1130 


2114 


3098 


787CTP2 148 



R074 


147 


1131 


2115 


3099 


787CTP2 149 


S076 


148 


1132 


2116 


3100 


787CIP2 150 



8077 


149 


1133 


2117 


3101 


7S7CTP9 151 



S07R 


150 


1134 


2118 


3102 


787CTP9 15? 



R079 


151 


1135 


2119 


3103 


787CIP2 153 



R0R7 



152 


1136 


2120 


3104 


787CTP9 154 



R0Q1 


153 


1137 


2121 


3105 


7R7CTP9 1S*J 


R 1 00 



154 


1138 


2122 


3106 


7K7CIP9 1S6 


R10S 



155 


1139 


2123 


3107 


7R7CTP2 1 S7 


RIOfi 



156 


1140 


2124 


3108 


7R7CTP? 1 SR 



R1 OR 



157 


1141 


2125 


3109 


787CTP2 159 



R109 



158 


1142 


2126 


3110 


787CIP2 160 


R1 10 



159 


1143 


2127 


3111 


787CIP2 161 



Rl 17 


160 


1144 


2128 


3112 


787CIP2 162 



Rl 1ft 


161 


1145 


2129 


3113 


787CIP2 163 


Rl iR 



162 


1146 


2130 


3114 


787CIP2 164 



R194 


163 


1147 


2131 


3115 


787CIP2 165 



R19S 


164 


1148 


2132 


3116 


787CIP2 166 



R197 


165 


1149 


2133 


3117 


787CIP2 167 


R13? 



166 


1150 


2134


3118 


7S7CIP2 168 


R135 



167 


1151 


2135 


3119 


787CIP2 169 


R137 



168 


1152 


2136 


3120 


787CIP2 170 



Rl 39 


169 


1153 


2137 


3121 


7S7CIP2 171 


8140 


170 


1154 


2138 


3122 


787CIP2 172 


8140 


171 


1155 


2139 


3123 


787CEP2_173 


8140 


172 


1156 


2140 


3124 


787C1P2 174 


8141 


173 


1157 


2141 


3125 


787CIP2_175 


8147


174


1158 


2142 


3126 


7R7PTP? 1 76 


R149 


175 


1159 


2143 


3127 


787PIP? 1 77 


Rl 50 


176


917


1901 


2885


3869 



787P1P2P 1 -S 1 


7029 


918 


1902 


2886 


3870 


787PTP2P IS? 


788 S 


919 


1903


2887 


"3871 


787PTP2P 1 S3 


R143 


920


1904 


2888


3872 



787PTP2P 1S4 



8143 


921 


1905 


2889 


3873 



787PTP2P 1SS 


8234 


922


1906 


2890 


3874 


7R7PTP2P 1 S6 


8463 


923 


1907 


2891


3875


787PTP7P 1 S7 


8467 



924 


1908 


2892 


3876 



787CIP2C_158


RS40 


925 


1909 


2893 


3877 


787CIP2C_159


R600 


926 


1910 


2894 


3878 


787CIP2C_160


96S6 


927 


1911


2895


3879 



787CIP2C_161


9669 


928 


1912 


2896 


3880 


787CIP2C_162


969S 


929 


1913 


2897 


3881 


787CIP2C_163


9744 


930 


1914 


2898 


3882 


787CIP2C_164


QR49 
70H7 


931 


1915 


2899 

4077 


3883 


787CIP2D_1


41 80 


932 


1916 


2900 


3884 


787CIP2D_2


41 Rl 


933 


1917 


2901 


3885


787CIP2D_3


4^14 


934 


1918 


2902 


3886 


787CIP2D_4


4S00 


935 


1919 


2903 


3887 



787CIP2D_5


S6S1 


936 


1920 


2904


3888 


787CIP2D_6


S691 


937 


1921 


2905 


3889 


787CIP2D_7


SRR1 



938 


1922 


2906 


3890 


787CIP2D_8


S88? 


939 


1923 


2907 


3891 


787CIP2D_9


6709 


940 


1924 


2908 


3892 


787CIP2D_10


6719 



941 


1925 


2909 


3893 


787CIP2D 11 



8130 



942 


1926 


2910 


3894 


787CIP2D_12


RR63 


943 


1927 


2911 



3895 


787CIP2D_13


890? 


944 


1928 


2912 


3896 


787CIP2D_14


916? 



945 


1929 


2913 


3897 


787CIP2D_15


9197 



946 


1930 


2914 


3898 


787CIP2D_16


921 S 


947 


1931 


2915 


3899 



787CIP2D_17


9?3? 


948 


1932 


2916 


3900 


787CIP2D_18


0?6? 


949 


1933 


2917 


3901 


787CIP2D_19


9369 


950 


1934 


2918 


3902 


787CIP2D_20 


9371 


951 


1935 


2919 


3903 


787CIP2D_21


9516 


952 


1936 


2920 


3904 


787CIP2D_22


9601 


953 


1937 


2921 


3905 


787CIP2D_23 


9731 
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954 


1938 


2922 


3906 


787CIP2D_24


9733 


955 


1939 


2923 


3907 


787CIP2D_25


9769 


956 


1940 


2924 


3908 


787CIP2D_26


9804 


957 


1941 


2925 


3909 


787CIP2D_27


9816 


958 


1942


2926 


3910 


787CIP2D 28 


9844 


959 


1943 


2927 


3911 


787CIP2D_29


9924 


960 


1944 


2928 


3912 


787CIP2D 30 


9936 


961 


1945 


2929 


3913 


787CIP2D 31 


10163 


962 


1946 


2930 


3914 


787CIP2D_32


10165 


963 


1947 


2931 


3915 


787CIP2D_33


10165 


964 


1948 


2932 


3916 


787CIP2D_34


10244 


965 


1949 


2933 


3917 


787CIP2D_35


10278 


966 


1950 


2934 


3918 


787CIP2E_1


4251 


967 


1951 


2935 


3919 


787CIP2E_2


5310 


968 


1952 


2936 


3920 


787CIP2E_3


5697 


969 


1953 


2937 


3921 


787CIP2E 4 


5731 


970 


1954 


2938 


3922 


787CIP2E_5


5733 


971 


1955 


2939 


3923 


787CIP2E 6 


5734


972 


1956 


2940 


3924 


787CIP2E_7


5740 


973 


1957 


2941 


3925 


787CIP2E_8


76S7 


974 


1958 


2942 


3926 


787CIP2E_9


9*572 



975 


1959 


2943 


3927 


787CIP2F_1


1363 



976 


1960 


2944 


3928


787CIP2F_2


4303 


977 


1961 


2945


3929 


787CIP2F_3




978 


1962 


2946 




787CIP2F_4




979 


1963 


2947 


3931 


787CIP2F_5




980 


1964 


2948 


3932 


787CIP2F_6


5161 


981 


1965


2949 


3933 


787CIP2F_7


5770 


982 


1966 


2950 


3934 


787CIP2F 8 


6855 


983 


1967 


2951 


3935


787CIP2F 9 


10026 


984 


1968 


2952 


3936 


787CIP2F 10 


10227 
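
The symbol legend used in the amino acid sequence column of Table 6 (one-letter residue codes, X for an unknown residue, * for a stop codon, and / and \ for possible nucleotide deletions and insertions) lends itself to a simple programmatic check of an entry. The following Python sketch is illustrative only and is not part of the disclosed methods; the function name summarize_peptide and the example string are assumptions introduced here and are not entries from the table.

```python
from collections import Counter

# One-letter codes exactly as listed in the Table 6 legend.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")
UNKNOWN = "X"      # X = Unknown
STOP = "*"         # * = Stop codon
DELETION = "/"     # / = possible nucleotide deletion
INSERTION = "\\"   # \ = possible nucleotide insertion


def summarize_peptide(seq: str) -> dict:
    """Count legend symbols in one sequence entry (hypothetical helper)."""
    # Sequences are wrapped over several lines, so drop all whitespace first.
    cleaned = "".join(seq.split())
    counts = Counter(cleaned)
    legend = RESIDUES | {UNKNOWN, STOP, DELETION, INSERTION}
    return {
        "residues": sum(counts[c] for c in RESIDUES),
        "unknown": counts[UNKNOWN],
        "stops": counts[STOP],
        "deletions": counts[DELETION],
        "insertions": counts[INSERTION],
        # Characters outside the legend, e.g. digits or lowercase letters.
        "unrecognized": sorted(set(cleaned) - legend),
    }


if __name__ == "__main__":
    example = "MKT*LP/V\\AX"  # made-up string, not a Table 6 entry
    print(summarize_peptide(example))
```

A non-empty "unrecognized" list flags characters that the legend does not define, which may indicate a transcription problem in the entry rather than a feature of the peptide.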



TABLE 6

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)


2953


A 


3 


324 


ISEHRIEASGNYLAQRLTSSFLRGLSSWKSNPLML 
CGWTILLTLT3VIVQGEP*GP\KGIPG\FHTNSSYPH 
WGTVAKPPAGD*DLLPAPGQEGTPLFTR*SLCTY 
CPID 


2954


A 


18


467


REELGKDLFDCTLYVLLKYDDFNADKHLALEEF 

YRAFQVIQLSLPEDQKLSITAATVGQSAVLSCAIQ 

GTLRPPIIWKJll^ 

KVTTTHVGNYTCYADGYEQVYQTHIFQVNVPPV 
IRVYPESQARRAG 


2955 


A 


3 


23 


FYSAFLVADKG1VTSKHNNDTQHIWESDSNEFSV
IADPRGNTLGRGTTIT^VSIPPSL 


2956 


A 


1 


493 


RTKTDW1LNLAVADLLLLFTLPFWAVNAVHGW 

VLGKIMCKITSALYTLNFVSGMQFLACISIDRYV 

AVTKVPSQSGVGKPCWnCFCVWMAAILLSIPQL 

VFYTVNDNARCIPIFPRYLGTSMKALIQMLEICIG 

FVVPFLMGVCYFITARTLMKMPNIKIS 


2957 


A 


703 


302 


EETGVREKRRERMKEKMWQNVLCCTLQTAVIL 
KiFQNKVLK^KNFFLSPLDTRK^KWKKWAGG 
PGAVAHACNPSTLGGRGGRITKSGDRDHPGQHG
ETRSLPACWAQWKSLALPVSRAPGRQGSLVVFP 
LP 




A 




1054 


CTKCKApCDTCWKNFCTKCKSGFYL 

NCPEGLEANNHTMECVSIVHCEVSEWNPWSPCT 

KKGKTCGFKRGTETRVREnQHPSAKGNLCPPTN 

ETRKCWQRKKCQKGERGKKGRERKRKKPNKG 

ESKEAIPDSKSLESSKEIPEQRENKQQQ 




A 

A. 


1 




LblVLLIbl lb 1 bHRLb VLWPI WYCCHCPTHLSAVMC 
VLLWALSLLQSILEWMFCSFLFSDVDSDNWCQIL 
DFLTAVWLIFLIXLVLCGrTLVLLVRIICGSQKMPL 
TRLYVTILLTGLVFLFCSLPLSIQ*FLLYWIEKDLD 

T"iT 
UL 


2960 


A


1194 


852 


EKRKTSYSQCLNSKQRNVSMRPSIW1HVHLKPPC 
RLVELLPFSSALQGLSHLSLGTTLP/V*GHLRFRL 
RNLPQSLRTVILPERNEEQNLQELSHNADKYQM 
GDCCKEEIDDSIFY 


2961 


A 


274 


2250 


EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLN 
SLTPPTSVRRMPLITWTLLKMVARHHMKLLCSK 
AFSTQLQQKIFLHSQMGIHHQSVCMKLKPNTSHII 
SILMGQPMALVQLETLAPLTIIIQKFQTQDHMKF 
WKNLPLHSHHLTPSVPQTVIPKKTGSPEIKLKITK 
TIQNGRELFESSLCGDLLNEVQASE\Q*NQSIESRK 
EKRKKSNKHDSSRSEERKSHKIPKLEPEEQNRPN 
ERVDTVSEKPREEPVLKEGSPSSANTIFCSNNGSV 
HWKFQVGDLVWSKVGTYPWWPCMVSSDPQL 
EVHTKINTRG AREYHVQFFSNQPERA WVHEKRV
REYKGHKQYEELLAEATKQASNHSEKQKIRKPR 
PQRERAQWDIGIAHAEKALKMTREERIEQYTFIYI 
DKQPEEALSQAKKSVASKTEVKKTRRPRSVLNT 
QPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEP 
PPVKIAWKTAAARKSLPASITMHKGSLDLQKCN 
mspvvkieqwalqnatgdgkfidq 
InFKTEISVRGQDRLHSTPNQIWEKPTQSVSSPEATS 
GSTGSVEKKQQRRSIRTRSESEKSTEVWl^KKIK 
KEQVETVPQATVKTGLQKGSADRGVQGSVRFSD 
SSVSAAIEETVD 


2962 


A 


2408 


836 


SASPPPPPPPPPSRFPFSGAPGARDRSGPLGSEPQR 

NPGARPRTLEATVTPPGSVGAMSSSGLNSEKVA 

ALIQKLNSDPQFVI^QNVGTTHDLLDICLICRATV 

QRAQHWQl^VPQEGKPITNQKSSGRCWIFSCLN 

VMRLPFN4KKLNIEEFEFSQSYLFFWDKVERCYFF 

LSAFVDTAQRKEPEDGRLVQFLLMNPANDGGQ 

WDMLVKIVEKYGVIPKKCFPESYTTEATRRMND

H.NHKMREFCIRLRKLVHSGATKGEISATQDVM 

MEEIFRVVCICLGNPPETFTWE\ r RDKDKNNKKIG 

PVITPLEFNR/EQHVKPLFNMEDKICLVNDPRPQH 

KYNKLYTV\EYL\SNMYWRGEKLF^ / NNQPIDFLK 

KMVAASIKDGXEAy WFGCDVGKHrANSKLGVLSD 

MNLYDrlELVFGVSLlOMNK^ 

HTMTFTAV/SQSRDDSGMVLFTKW\RVGEFQWG 

PnHGTOrTVT PMTn* vn^i FWVFWA/wnpvw 

I2iLJn\Jxx VTv\J I Lt\^yVx I \J V VJOJ-r-C I V I EV V / V W JJISJSoT. 

VP\EEVLAVLGAGNPFVLPAWDPMGALAE 


2963 


A 


90 


543


RHYDSAGKITLKIAKbTn.EQRAVGGASPRLAQS 
VLTCSREPILENSLTSLIEYLHNALEHDMRLRFNN 
DRMKTTIKETST*LSNS YLVFPLM* SLTYLMKMS
FERCTARNKMFVNSPFTKVDNYCT\SS\WKKPYL 
KCYFSLNTIKKEKKMT 


2964 


A 


3 


2454 


FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KLNTEPKDVP/IACASA*GFLPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGAVTVVF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTKVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSrSNGNYSQLQFQ 

AREYSGAP YSQRDNFQQC YKRGGTSGGPRANSJR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

Ty\ JT\\ TTVl fT'MTl A A TTT T>\/TTA fV/TJT Tl/^/'Mi JTD \ 7 A TO A All 

rVJJ VPV 1 NrAAl LLP VHV YrLrXJQMKVAr SAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPWGTYVFIFrlMLKLAVNVPLYVNLMKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL 

HRGAIYGSSW 


2965 


A 


3 


2454


FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKS-WEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWFVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KLNTEPKD VP/IAC ASA* GFLPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVEFFQGAVRVF 

NVNAPLPPRKEQEIKESPYSPGYNQSFITASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PVDVPV INrAA riLPVHVYrLPQQMRVArSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTYVFIFHMLKLAV>JVPLYVNLMKNEEVL 

VSA Y ANDGAPDHETASNH AILOLFOGDOIWLRL 

HRGAIYGSSW 


2966 


A 


1693 


227 


DYVLTAELHRQRSPGVSFGLSVFNLMNAIMGSGI 

LGLAYVMANTGVFGFSFLLLTVALLASYSVHLL 

LSMCIQTAYLGP*TNYFMVLPAH*LTCLPLIEFLQ
SL*NSL\*AVTSYEDLGLFAFGLPGKLWAGTiriQ 

MGAMSSYLLIIKTELPAA1AEFLTGDYSRYWYLD 

GQTLLIIICVGIVFPLALLPKIGFLGYTSSLSFFFM 

AffFALVVIIKKWSIPCPLTLNYVEKGFQISNVTl)D 

CKPKLFHFSKESAYALPTMAFSFLCHTSILPIYCE 

LQSPSKKRMQNVTNTAIALSFLIYTISALFGYLTF

YD/GTTKAQRGEVTCHRJKDKVESELLKG***IP* 

orUD V V VM 1 \ V KLCLLr A VJLLA I VFLlHr PARKA VT 

MMFFSOTPFSWIRHFLITLALNIIIVLLAIYVPDIRN 

VFGWGASTSTCLIFIFPGLFYLKLSREDFLSWKK 

LGVGCFC/LLSFKTSILRNSLSVYIILPASRKSIYFK 

I - 


2967 


A 


3 


3222 


SGIVVRALWREKKPGGGRRVKRRNPGRQAVGH 

TEEDPPRVGTPWKEHTGPGPQEGS1MEAAHAKT 

TEECLAYFGVSETTGLTPDQVKRNLEKYGLNELP 

AEEGKTLWELVDEQFEDLLVRILLLAACISFVLA 

WFEEGEETITAFVEPFVILLIL1ANA1VGVWQERN 

AENAJEALKEYEPEMGKVYRADRXSVQRIKARD 

WPGDIVEVAVGDKVPADIRILAIKSTTLRVDQSIL 

TGEYVSVIKHTEPVPDPRAVNQDIGCNMLFSGTNI

AAGKALGIVATTGVGTEIGKIRBQMAATEQDKT 

PLQQKLDEFGEQLSKVISLICVAVWLINIGHFNDP 

VHGGSWFRGAIYYFKIAVALAVAAIPEGLPAVIT 

TCLALGTRKMAKKNAIVRSLPSVETLGCTSVICS 

DKTGTLTTNQMSVCKMFIIDKVDGDICLLNEFSIT 

GSTYAPEGEVLKNDKPVRPGQYDGLVELATICA 

LCNDSSLDFNEAKGVYEKVGEATETALTTLVEK 

MNVFNTDVRSLSKVERANACNSVIRQLMKKEFT 

LEFSRDRKSMSVYCSPAKSSRAAVGNKMFVKGA 

PEGVD3RCNYVRVGTTRVPLTGPVICEKIMAVIKE

WGTGRDTLRCLALATRDTPPKREEMVLDDSARF 

LEYETDLTFVGVVGMLDPPRKEVTGSIQLCRDA 

GIRVIMITGDNKGTAIAICRRIGIFGENEEVADRA 

Y\TGREFDDL\PLAEQ\REACRRACCFARVEPSHK 

SKIX^YLQSYDEITAMTGDGVNDAPALKICAEIGI 

AMGSGTAVAKTASEMVLADDNFSTIVAAVEEGR

AIYNNMKQFIRYLISSNVGEVVCIFLTAALGLPEA 

LIPVQLLWVNLVTDGLPATALGFNPPDLDIMDRP 

PRSPKEPLRSGWLFFRYMAIGGYVGAATVGAAA 

W WJr-Li AliJJOrrlVIN JlrMv^UlbDiN lilrbOl 

DCEVFEAPEPMTMALSVLVTIEMCNALNSLSEN 
QSLLRMPPWWIWLLGSICLSMSLHFLILYVDPLP
MIFKLRALDLTQWLMVLKISLPVIGLDEILKFVA 
RNYLEG*LFPLLHL*ARVTDPEDERRK 


2968 


A 


3 


2414 


GARSCSRLGRCTFPLWKGREMEVRKLSISWQFLI 

VLVLELQILSALDFDPYRVLGVSRTASQADIKBCA 

^^KKLAREWHPDKNKDPGAEDICFIQISKAYEILSN 

EEKRSNYDQYGDAGENQGYQKQQQQREYRFRH 

FHENFYFDESFFHFPFNSERRDSEDEKYLLHFSHY 

VNEVAPDSFKKPYLIKITSDWCFSCrmEPVWKEV 

TOFT FFT n Vmn WU A n VPT? P I AT-TPT flAT-I^TPQT 

LGHNGKISFFHNAVVRENLRQFVESLLPGNLVEK 
VTNKNYVRFLSGWQQENKPHVLLFDQTPIVPLL 
YKLTAFAYKDYLSFGYVYVGLRGTEEMTORYM 
NIYAPTLLWKEHINRPADVIQARGMKICQIIDDFI
TRNKYLLAARLTSQKLFHELCPVKJISHRQRKYC 

VVLLTAETTKLSKPFEAFLSFALANTQDTVRFVH 

VYSNRQQEFADTLLPDSEAFQGKSAVSILERRNT 

AGRWYKTLEDPWIGSESDKFILLGYLDQLRKDP 

ALLSSEA VLPDLTDELAP VFLLRWFYS ASDYI SD 

CWDSIFH>^W\REMMPLLSLIFSALFILFGTVIVQ 

AFSDSNDERESSPPEKEEAQEKTGKTEPSFTKENS 

SKIPKKGFVEVTELTDVTYTSNLVRLRPGHMNV 

VLILoNb 1 K 1 c>LL(jKr ALJb V Y 1 r I CjoaCJLHr Sr LSL 

DKHREWLEYLLEFAQDAAPIPNQYDKHFMERDY 

TGYVLALNGHKKYFCLFKPQKTVEEGGKP*GSC 

SDVDSSLYLGESRGKPSCGLGSRPIKGKLSKLSL 

WMERLLEGSLQRFYIPS WPELD 


2969 


A 


48 


1117 


KGLSPDQVLSAFAPLDCEMWLKVFTTFLSFATG 

ACSGLKVTVPSHTVHGVRGQALYLPVHYGFHTP 

ASDIQITWLFERPHTMPKYLLGSVNKSVVPD/Y GI 

P/YTSSP*CHPMASLLINPLQFPDEGNYIVKVNIQG 

NGTLSASQKIQVTVDDPVTKPVVQIHPPSGAVEY 

VGNMTLTCHVEGGTRLAYQWLKNGRPVHTSST 

YSFSPQNNTLHIAPVTKEDIGNYSCLVRJvrPVSEM 

ESDIIMPIIYYGPYGLQVNSDKGLKVGEVFTVDL 

GEAILFDCSADSHPPNTYSWIRRTDNTTYIIKHGP 

RLEVASEKVAQKTMDYVCCAYNNITGRQDETHF 

TVnTSVGMCDIQGRDPNKT 


2970


A 


68 


936


HSALLTHSSFCVFTLCQDFFTYSSMSEEVTYADL

QFQNSSEMEKIPEIGKFGEKAPPAPSHVWRPAAL 

FLTLLCLLLLIGLGVLASMFHVTLKIEMKKMNKL 

QNISEELQRNISLQLMSNMNISNKIRNLSTTLQTI 

ATKLCRELYSKEQEHKCKPCPRRWrWHKD 

LSDDVQTWQESKMACAAQNASLLKINNKNALE 

FIKSQSRSYDYWLGLSPEEDS/YSWYESG*YNQ\P 

SAWVIRNAPDLNNMYGGYINRLYVQYYHCTYK 

QRMICEKMANPVQLGSTYFREA 


2971 


A 


912 


2287 


VPNYLPSVSSAIGGEVPQRYVWRFCIGLHSAPRF 
LVAFAYWNHYLSCTSPCSCYRPLCRLNFGLNVV 
ENLALLVLTYVSSSEDF/TWVPG*GRSGEVFPEGT 
GLPLPHSDLPTSWCGHSLQCGSQSSFPPAIHENAF 
WF^SSLGHMLLTCILWRLTKKHTVSQE\DGLSL 
AGAPRQPRRKSRTSVLRIRVMVRWELSSNGNPG 
RGVLGLGLGLGNKLRVVGQNLGL*HCVWVVWE 
TGE*KRWRLQMGIE* G V ASRRQ* VRNS VRGLVC 
HNSSAPPMYMGFFSPTVFGGGVGG*LHVTFILHP
rbVbAAUlFLLLOPSLPQRQGJ^HIVVU^AAPACA 
PFHDR*WEPREIRPSP*ELGLRGEPTLSYPASCRVT 
RQPIP*DRKSYSWKQRLFIINFISFFSALAVYFRHN 
MYCEAGVYTIFAILEYTVVLTKN1AFHMTAWWD 
FGNKELLITSQPEEKRF


2972 


A 


1734 


246 


GGILSGRDGRTALPRPREPAERTAGLRRDMRPQE 

LPRLAFPLLLLLLLLLPPPPCPAHSATRFDPTWES 

LDARQLPAWFDQAKFGIFIHWGVFSVPSFGSEWF 

W W i Wl^isJiiSJLrJV i V lir JYUSJUIN I rforJv i cJJrO.T\L 

FTAKFFNANQ\WADEFQASGAKYIVLTSKHHEGF 

TLWG\SEYSWNWNAIDEGPKRDIVKELEVAIRNR 

TDUU^GLYYSLFEWFHPLFLEDESSSFHKRQFPVS 

KTLPELYELVNNfYQPEVLWSDGDGGAPDQYWN
STGFLAWLYNESPVRGTVVTNDRWGAGSICKHG 

GFYTCSDRYNPGHLLPHKWENCMTEDKLSWGY 

RREAGISDYLTEEELVKQLVETVSCGGNLLMNIG 

PTLDGT1SVVFEERLRQMGSWLKVNGEAIYETHT 

WRSQNDTVTPDVWYTSKPKEKLVYAIFLKWPTS 

GQLFLGHPKAILGATEVKLLGHGQPLNWISLEQK 

GIMVELPQLTIHQWCKWGWALALTOVI 


2973


A 


24


1133 


SVPRAGGDMETGAAELYDQALLGILQHVGNVQ 

DFLRVLFGFLYRKTDFYRLLRHPSDRMGFPPGAA 

QALVLQVFKTFDHMARQDDEKRRQELEEKIRRK 

EEEEAKTVSAAAAEKJEiPVPVPVQEIEIDSTTELDG 

HQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGA 

AEVPR\EPPELPRIQEQFQBCNPDSYNGAVRENYTW 

SQDYTDLEVRVPWKHVVKGKQVSVALSSSSIRV 

AMLEENGERVLMEGKLIHKINTESSLWSLEPGK 

CVLVNLSKVGEYWWNADLEGEEPIDIDKINKERS 

MATVDEEEQAVLDRLTFDYHQKLQGKPQSHEL 

KVHEMLKKGWDAEGSPFRGQRFDPAMFN1SPGA 

VQF 


2974 


A 


271 


1854


MQFGRAHGDCVSGAQLCGCPSMDDYMVLRMIG 
EGSFGRALLVQHESSNQMFAMKEIRLPKSFSNTQ 
NSRKEAVLLAKMKHPNIVAFKESFEAEGHLYIV 
MEYGDGGDLMQKIKQQKGKLFPEDMILNWFTQ 
MCLGVNH1HKKRVLHRDIKSKNIFLTQNGKGKL 
GDFGSARLLSNPMAFACTYV GTPYYVPPEI WEN
LPYNNKSDIWSLGCELYELCTLKHPFQANS WKNL 
ILKVCQGCISPLPSHYSYELQFLVKQMFKRNPSH 
RPSATTLLSRGIVARLVQKCLPPEIIMEYGEEVLE 
E1KNSKHNTPRKKTNPSRIRIALGNEASTVQEEEQ 
DRKGSHTDLESINENLVESALRRVNREEKGNKSV 
HLRXASSPNLHRRQ WEKNVPNTALTALENA SILT 
SSLTAEDDRGGSVIKYSKNTTRKQWLKETPDTLL 
NILKNADLSLAFQTYTIYRPGS\EGFLKGPLSEETE 
ASDSVDGGHDSVILDPERLEPGLDEEDTDFEEED 
DNPDWVSELKKRAGWQGLCDR 


2975 


A 


32 


2833


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQR1GSGTYGDVYK 

ARNVNTGELAAIKVIKLEPGEDFAWQQEIIMMK 

D\CKHP\DIV A YF\GS YL\RRDKL WI\CMEF\CGS G S 

ALQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTKNPKKRPT 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGAN1CSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAK1P 

PPLPPKPKSEFIPQEMHSTEDENQGTIKRCPMSGSP 

\ a T/nc at rT>T»r> "nT>T»T>Tj t *nT>"LTT/"D\ / a T PVT/^H/TCcrAT xir 1 

\AKPSQ VPPKrPPPKLPPrlKr V AIAjin u M b Q LN u 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKJHCASSWTNPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA*
RQMQKLPVAffAHKLPDRILPRKFSVSAKIPEIK 

WCQKCCVVRNPYTGHKYLCGALQTSIVLLEW 

EPMQKFMLIKMDFPIPCPLKMFEMLVVPEQEYP 

LVCVGVSRGRDFNQVVRFETVWNSTSSW^ 

DTPQTNVTHVTQLERDTILVCLDCCnOVNLQGR 

LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVVVLES 

RPTDNPTANSNLYILAGHENSY 


2976 


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQR1GSGTYGDVYK 

ARNVNTGELAAIKVIKLEPGEDFAVVQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS 

\LQDIYHVTGPLSELQ1AYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTKNPKKRPT 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGGPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 

RQMQKLPVAIPAHKLPDRILPRKFSVSAKIPETK 

WCQKCCVVRNPYTGHKYLCGALQTSIVLLEWV 

EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP 

LVCVG V SRGRDFNQ WRFETVNPNSTS S WFTES 

DTPQTWTHWQLERDTILVCLDCCIKIVNLQGR 

LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVWLES 

RPTDNPTANSNLYILAGHENSY 


2977 


A 


174 


1543 


YSLRKGITFKLAGAMVHDCKGELTQEEKELLEVI 

GKGTVQEAGTLLSSKNVRVNCLDENGMTPLMH 

AAYKGKLDMCKLLLRHGADVNCHQHEHGYTA 

LMFAALSGNKDITWVMLEAGAETDVVNSVGRT 

AAQMAAFVGQHDCVTIINNFFPRERLDYYTKPQ 

GLDKEPI<XPPKLAGPLHKnTTimi-IPViaVMLV 

NENPLLTEEAALNKCYRVMDLICEKCMKQRDM 

NEVLAMKMHYISCIFQKCINFLKDGENKLDTLIK 

SLLKG\RASDGFPVYPEKILRESIRK\FPYCEATLL 

QQL VRSIAP VEIG SDPTAFSVLTQ AITGQVGFVD V 

EFCTTCGEKGASKRCSVCKMVIYCDQTCQKTHW 

FTHKKJCKNLKDIYEKQQLEAAKEKRQEENHGK 

LDVNSNCVNEEQPEAEVGISQKDSNPEDSGEGK 

KESLESEAELEGLQDAPAGPQVSEE 


2978


A 


% 
j 


Jilt 


crvnT ptht popa/ot^ acct vi d/ta/vrvt cvktctc 

oUJJJLK 1 Lj.LT K^JJ V V^JJAJloJLJsXii^LJ Vie Vl/P I iNxilE- 

DCPGMMLWRYPEPRGLTLVRITPVPFNTTEDPDI 
STADLGDVLQDPCSLEYWDELQKVFVAFREFNL 
SESKVCELQLPDimVhnDQKia,VSSDLWRIVLNS 
SQNGADDQSSASESGSQSTCDPLVTPTALAACTR 






WO 01/57190 



PCT/US01/04098



YDSCFTPWFWSLCVSFQFAHLEFHLCHHLDQLG 

TAAPQYLQPFVSDRNMPSELEYMIVSFREPHMYL 

RQWNNGSVCQEIQFLAQADCKLLECRNVTMQS 

VVKPFSIFGQMAVSSDVVEKLLDCTVIVDSVFVN 

LGQHWHSLNTAIQAWQQ^CPEVEELVFSHFV 

ICKTDTQETLRFGQVDTDENILLASLHSHQYSWRS 

HKSPQLLHICIEGWGNWRWSEPFSVDHAGTFIRT 

IQYRGRTASLIIKVQQLNGVQKQIHCGRQIICSYL 

SQSEELKVVQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLESK 

APEYSIVIQVPSSNSSIIYVWCTVLTLEPNSQVQQ 

RMIVFSPLFIMRSHLPDPIIIHLEKRSLGLSETQIIP 

GKGQEKPLQNIEPDLVHHLTFQAREEYDPSDCA 

VPISTSLIKQIATKVHPGGTVNQILDEFYGPEKSL 

QPIWPYNKKDSDRNEQLSQWDSPMRVKLSIWKP 

YVRTLLIELLPWALLINESKWDLWLFEGEKIVLQ

VPAGKHIPPNFQEAFQIGIY WANTNTVHKS V AIK 

LVHNLTSPKWKDGGNGEVVTLDEEAFVDTEIRL 

GAFPGHQKLCQFCISSMVQQGIQIIQIEDKTTIINN 

TPYQIFYKPQLSVCNPHSGKEYFRVPDSATFSICP 

GGEQPAMKSSSLPCWDLMPDISQSVLDASLLQK

QIMLGFSPAPGADSSQCWSLPAIVRPEFPRQSVA 

VPLGNFRENGFCTRAIVLTYQEHLGVTYLTLSED 

PSPRVIIHNRCPVKMLIKENIKDIPKFEVYCKXIP^ • 

ECSIHHELYHQISSYPDCKTKDLLPSLLLRVEPLD 

EVTTEWSDAIDINSQGTQVVFLTGFGYVYVDVV 

HQCGTVFITVAPEGKAGPILTNTNRAPEKIVTF/K 

MFITQLSLAVFDDLTHHKASAELLRLTLDNIFLC 

VAPGAGPLPGEEPV A ALFEL YC VEICCGDLQLDN 

QLYNKSNFiHDFAVLVCQGEKAEPIQCSKMQSLLlS 

NKELEEYKEKCFIKLCITLKEGKSILCDINEFSFEL 

KPARLYVEDTFVYYIKTLFDTYLPNSRLAGHSTH 

LSGGKQVLPMQVTQHARALVNPVKLRKLVIQPV 

NLLVSIHASLKLYIASDHTPLSFSVFERGPIFTTAR 

QLVHALAMHYAAGALFRAGWVVGSLDILGSPA 

SLVRSIGNGVADFFRLPYEGLTRGPGAFVSGVSR 

GTTSFVKHISKGTLTSITNLATSLAR3SIMDRLSLDE 

EHYNRQEEWRRQLPESLGEGLRQGLSRLGISLLG

AIAGIVDQPMQNFQKTSEAQASAGHKAKGVISG 

VGKGIMGVFTKPIGGAAELVSQTGYGILHGAGLS 

QLPKQRHQPSD\VHADQAPNSHVKYVWKMLQS 

t m>Tyc\rLT\A a t ty\/\/t \/t> q r^xzu'c r~*r^i t t TOtJWT 
i^VjKrJfc V rilVLAlj D V V L V KLi b ul^JbribOCLLLI bbVL 

FVVSVSEDTQQQAFPVTEIDCAQDSKQNNLLTV 

QLKQPRVACDVEVDGVRERLSEQQYNRLVDYIT 

KTSCHLAPSCSSMQPCPWAAEPPPSTVKTYHY 

LVDPHFAQVFLSKFTMVKNKALRKGFP 


2979 


A 


255 


2673 


AWLFP ASVLCPRCLTG S A VGSAE WKSLV VLFPFS 
SRPTLGHLDSKPSSKSNMIRGRNSATSADEQPfflG 
NYRLLKTIGKGNFAKVKLARHILTGKEVAVKHD 
KTQLNSSSLQKLFREVRIMKVLNHPNIVRLFEVIE 

TPPfTT VT VMFYA^nnFVFFiVT V A Pfffll \A\CV\CV A 
1 X_*Xv 1 J_/ 1 JL> V lviX> I riovJVJd VrL' I JL* V f\ rlO imVJJxJ_J VJD.TV. 

RAKFRQIVSAVQYCHQKFIVHRDLKAENLLLDA 
DMNDC1ADFGFSNEFTFGNKLDTFCGSPPYAAPEL
FQGKKYDGPEVDV WSLG VELYTLVSGSLPFDG Q 
NLKELRERVLRGKYRIPFYMSTDCENLLKKFLIL
NPSKRGTLEQIMKDRWMNVGHE\DDELKPYGEP 

LP\DYKDPRRTELMVSMGYTREEIQDSLVGQRYN 

EVMATYLLLGYKSSELEGDTITLKPRPSADLTNS 

SAPSPSHKVQRSVSANPKQRRFSDQAGPAIPTSNS 

YSKKTQSNNAENKRPEEDRESGRKASSTAKVPA 

SPLPGLERKXTTPTPSTNSVLSTSTNRSRNSPLL\E 

RASLVGQGFHPEWAKTALTMPGSRASTASASAA 

VSAARPRQHQKSMSASVHPNKASGLPPTESNCE 

VPRPRQVCWGSCTAPQRVPVASPSAHNISSSGGA 

PDRTNFPRGVSSRSTFHAGQLRQVR\DQQNLPYG 

\/TTU A CDC PUCnr T> T3 /"^ A O/""' O TTC T/ PTO T/ Tl ITt OX IT XTT"* 

V lrAarbOHoC^OKRuAbCiblrSKJr TlbKxVRRNLNE 

PESKDR\VETLRPHW\NSGGNDKEKEEFREAKPR 

SLRiTWSMKTTSSMEPNEMMREIRKVLDANSCQ 

SELHEKYMLLCMHGTPGHEDFVQWEMEVCKLP 

RLSLNGVRFKRISGTSMAFKNIASKIANELKL 


2980 


A 


120


3433 


NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKP 

LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 

QKGQQMLARCPKSAETNIDQDTNNLKEKWESVE 

TKLNER\KT\KLEEALNLA\MEFHNSL\QDFINWLT 

QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 

SHREQUELDKTGTHLKYFSQKQDVVLIKNLLISV 

QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW 

SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\ 

LLFSGQFTDALQALID WLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTS S VQALKRS A 

RELIEGSRDDSSWVKVQMQELSTRWETVCALS1S 

KQTRLEAALRQAEEFHSVVHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLIDQHKEFMXKLEEICRAE 

LNKATTMGDTVLAICHPDSITTIKHWITIIRARFEE 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDBCDKEVIPQEIEEVKALIAEHQTFM

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMRWM>raOCSRVMDFFRRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RJDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VK3TOPCRAKGRTNMELREKFILADGASQGMAA 

FRPRGRJRSRPSSRGASPNRSTSVSSQAAQAASPQ

\7U A 'I" ["PDT/'TT T T*T>T 'I'll vn/ri T-/"Tl\1 7T T'\.TOT/"l\ /PTTir'T/ A A 

VJfAl 1 1 risJLilrL 1 KM Y UKr WL IN bKMS i r CKAA 
ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 
DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 
GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 
PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 


2981 


A 


120 


3433 


NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKP 
LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 

TKLNER\KT\KLEEALM-A\MEFHNSL\QDFINWLT 
QAEQTLNVASRPSLELDTVLFQIDEHICVFANEVN 
SHREQHELDKTGTHLKYFSQKQDVVLIKNLLISV 
QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW
SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\ 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA

RELffiGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSWHALLEWLAEAEQTL

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 

LNKATTMGDTVLAICHPDSITTIKHWITnRARFEE 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVEPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMR\\TV1NHKKSRVMDFFRRIDK 

DQDGKJTRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKGKCAKRFQVEQIGDNKYRPFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFILADGASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VFA I rrrKlLHPLTKNYGKPWLTNS 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 


2982 


A


1 


2065 


MAAGGAEGGSGPGAAMGDGAEIKSQFRTREGF 

YKLLPGDGAARRSGPASAQTPVPPQPPQPPPGPA 

SASGPGAAGPASSPPPAGPGPGPALPAVRLSLVR 

LGEPDSAGAGEPPATPAGLGSGGDRVCFNLGRE 

LYFYPGCCRRG SQRWHTPLTPFLPPLKSIDLNKPI 

DKRJYKGTQPTCHDFNQFTAATETISLLVGFSAG

QVQYLDLIKKDTSKLFNEERLIDKTKVTYLKWLP 

ESESLFLASFIASGHLYLYNfVSHPCASAPPQYSLL 

KQVAWGFSl^AAKSKAPRNPLAKWAVGEGPLNE 

FAFSPDGRHLACVSQDGCLRVFHFDSNELLRGLM 

KSYFGGLLCVCWSPDGRYWTGGEDDLVTVWS 

FTEGRVVARGHGHKSWVNAVAFDPYTTRAEEA 

ATAAGADGERSGEEEEEEPEAAGTGSAGGAPLSP 

LPKAGSITYRFGSAGQDTQFCLWDLTEDVLYPHP 

PLARTRTLPGTPGTTPPAASSSRGGEPGPGPLPRS 

LSRSNSLPHPAGGGKAGGPGVAAEPGTPFSIGRF 

ATT TT /^"CD T> T\T") A "C , Y>>"T7T TTATJ \/T TOT p\7TCnPnpr>r< 

A l L 1 LQbKJKJDRu AbK£HKJRYHSLGMSRGGSGG 
SGSGGEKPSGPVPRSRLDPAKVLGTALCPRIHEV
PLLEPLVCKKIAQERLTVLLFLEDCHTACQEGLIC 
TWARPGKAFTDEETEAQTGEGSWPRSPSKSVVE 
GISSQPGNSPSGTVV 


2983 


A 


3855 


220 


RRFRLSAHRAQPCCRCRGLEMPRGVFQQLSNLV 
LQELNANLSNLTSAFEKATAEKIKCQQEADATN 
RVILLANRLVGGLASENIRWAESVENFRSQGVTL 

V-fVJL^ V JL»i-»XOrtr Vol V U I r I JSJV I IviN Jui^lYlCINJ Wlx I 1 

HNLKWIPITKGLDPLSLLTDDADVATWNNQGLP 
SDRMSTEN ATELGNTERWPLIVDAQLQGIKWIICN 
KYRSELKAIRLGQKSYLDV1EQATSEGDTLLIENI 
GETVDPALDPLLGRKTIKKGKYIIGGDKEVGVPP
Q VPPDPTHQ VLQPTLQ ARD AG S VHXLINFLVTRD 

GLEDQLLAAVVAKERPDLEQLKANLTKSQNEFK 

IVLKELEDSLLARLS AA SGNFLGDTALVENLETT 

KHT A SEIEEK V VE AKITE VKINE AREhTVT^P AAER 

ASLLYFILNDLNKINPVYQFSLKAFNVVFEKAIQR 

TTPANEVKQRVINLTDEITYSVYMYTARGLFERD 

KLIFLAQVTFQVLSMKKELNPVELDFLLRPPFKA 

GVVSPVDFLQHQGWGGIKALSEMDEFKNLDSDI

EGSAKRWKKLVESEAPEKEEFPKEWKNKTALQK 

LCMVRCLRPDRMTYAIKNFVEEKMGSKFVEGRS 

VEFSKSYEESSPSTSIFFILSPGVDPLKDVEALGKK 

LGFTIDNGKLHNVSLGQGQEWAENALDVAAEK 

GHWVILQNIHLVARWLGTLDKKLERYSTGRHED 

YRVFIRAEPAPSPETHIIPQGILENAIKITNEPPTGM 

YANLYKALDLFTQDTLEMCTKEMEFKCMLFAL 

CYFHAVVAERRKFGAQGWNRSYPFNNGDLTISI 

NVLYNYLEANPKVPWDDLRYLFGEIMYGGmTD 

DWDRRLCRTYLAEYIRTEMLEGDVLLAPGFQPP 

NLDYKG YHEYIDENLPPESPYL YGLHPNAEIGFL 

TVTSEKLFRTVLEMQPKETDSGAGTGVSREEKV 

KAVLDDILEKIPETrTSfMAEIMAKAAEKTPYVVV 

AFQECERMNCLTNEMRRSLKELNLGLKGELTITT 

DVEDLSTALFYDTVPDTWVARAYPSMMGLAAW 

YAl^LLRJRELEAWTTDFALPTTVWLAGFFlSrPQS 

FLTAIMQS]VLARKNEWPLDKMCLSVEVTKKKRE 

DMTAPPREGSYVYGLFMEGARWDTQTGVIAEA 

RLKELTPAMPVIFIKAIPVARMETKNIYECPVYKT 

RIRGPTYVWTFNLKTKEKAAKWILAAVALLLQV 


2984 


A


2 


1464 


FVLFPGIAMETPGASASSLLLPAASRPPRKREAGE 

AGAATSKQRVLDEEEYIEGLQTVIQRDFFPDVEK 

LQAQKEYLEAEENGDLERMRQIAIKFGSALGKM 

SREPPPPYVTPATFETPEVHAGTGVVGNKPRPRG 

RGLEDGEAGEEEEKEPLPSLDVFLSRYTSEDNAS 

FQEIMEVAKERSRARHAWLYQAEEEFEKRQKDN 

LELPSAiEHQAIESSQASVETWKYKAKNSLMYYP 

EGWDEEQLFKKPRQWHKKTRFLRDPFSQALSR 

CQLQQAAALNAQHKQGKVGPDGKELEPQESPRV 

GGFGFVATPSPAPGV^SPMMTWGEVENTPLRV 

EGSETPYVDRTPGPAFKILEPGRRERLGLKMANE 

AAAKNRAKKQEALRRVTENLASLTPKGLSPAMS 

PALQRLVSRTASKYTDRALRASYTPSPARSTHLK 

NPGPVGCRPPQSTPGA/PGSATRTPL\TQDPA\SIT 

DNLLQLPARRKASDFF 


2985 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 
rjnnnnvnxrnnnTn a a n*;p nvp n\rvnp nnp pp n ■ 

GSGGGGSVGGAAGYNRSSGGYEPRGRGGGRGGR 
GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 
NNTIFVQGLGENVTffiSVADWKQIGIIKTNKKTG 
QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 
WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

tsSJS\\3\Jr JVIVJJKAjO I UuuuouuuuKuurroUuuuu 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 
APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 
RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 
DSRGEHRQDRRERPY 


2986 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVT1ESVADYFKQIGIIKTNKKTG

QPMNLYTDRETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 
APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD
RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 
DSRGEHRQDRRERPY 


2987


A 


i hi & 

LD /U 


070 ■ 


an a v a cm a pxjrpnnrr wx>xr\mny c a a dcci /■it/^a /tt 
OO/US^OOArxlrr ALrrJ\Ji.VOOLoAArbbVJblJiVLL 

WAGARQHGRNWRKRETSPGTQGPLPPVPRA^PP 

GPDG\PHAIAPTLSWAIPRQQCSPQPGRLNALPPD 

RCSGPHFGDRAPESCFPGACSVSGACAFKGTRPA 

CPPQEPSLRSSRNRLREGQTFGRMEI 


2988 


A 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLAIDP 

LRVAPLPLYAAIFLVGVPGNAMVAWVAGKVAR 

RRVGATWLLHLAVADLLCCLSLPILAVPIARGGH 

WPYGAVGCRALPSIILLTMYASVLLLAALSADLC 

FLALGPAW\CLRFS/GACGVQVACGAAWTLALL 

LTVPSAIYRRLHQEHFPARLQCVVDYGGSSSTEN 

A VT A TD T7T VCiTH flDT \T A V A CPUO ATT mi/ A ADOP 

AVI AlKr JLr Kjr Lur JL V A V AoOrlo AIXU WAAKKL 

RPLGTAIWGFFVCWAPYHLLGLVLTVAAPNSA 
LLARALRAEPLIVGLALAHSCLNPMLFLYFGRAQ 
LRRSLPAACHWALRESQGQDESVDSKKSTSHDL 
VSEMEV


2989 


A 


27 


4074 


KSQLFCFWVGKAGDILSGDQDKEQKDPYFVETP

YGYQLDLDFLKYVDD1QKGNTIKRLNIQKRRKPS 

VPCPEPRTTSGQQGrWTSTESLSSSNSDDNKQCP 

NFLIARSQVTSTPISKPPPPLETSLPFLTIPENRQLP 

PPSPQLPKHNLHVTKTLMETRRRLEQERATMQM 

TPGEFRRPRLASFGGMGTTSSLPSFVGSGNHNPA 

KHQLQNGYQGNGDYGSYAPAAPTTSSMGSSERH 

SPLSSGISTPVTNVSPMHLQHIREQMAIALKRLBCE 

LEEQVRTIPVLQViaSVLQEEKRQLVSQLKNQRA 

ASQINVCGVRKRSYSAGNASQLEQLSRARRSGG 

ELYIDYEEEEMETVEOSTORIKEFROL\TADMOA 

LEQKIQDSSCEASSELRENGECRSVAVGAEENMN 

DIWYHRGSRSCKDAAVGTLVEMRKCGVSVTEA 

MLGVMTEADKEIELQQQTIESLKEKIYRLEVQLR 

ETTHDREMTKLKQELQAAGSRKKVDKATMAQP 

LVFSKVVEAVVQTRDQMVGSHMDLVDTCVGTS 

VETNSVGISCQPECKNKVVGPELPMNWWIVKER 

VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTE 

ESVNDLTLLKTNLNLKEVRSIGCGDCSVDVTVCS 

PKECASRGVNTEAVSQVEAAVMAVPRTADQDT 

STDLEQ\^QrTNTETATLffiSCTNTCLSTLDKQTS 

TQWETRTVAVGEGRVKDINSSTKTRSIGVGTLL 

SGHSGFDRPSAVKTKESGVGQININDNYLVGLK 

MRT1ACGPPQLTVGLTASRRSVGVGDDPVGESLE 

NPQPQAPLGMMTGLDHY1ERIQKLLAEQQTLLA 

ENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVM 

KSASTEELRNPDFQKTSLGKITGSYLGYTCKCGG 

LQSGSPLSSQTSQPEQEVGTSEGKPISSLDAFPTQ 

EGTLSPVNLTDDQ1AAGLYACTNNESTLKSIMKK 

KDGNKDSNGAKKNLQFVGINGGYETTSSDDSSS

DESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAE 

GHHAVNDEGLKSARVEDEMQVQECEPEKVEIRE 

RYELSEKMLSACNLLKNTINDPKALTSKDMRFC 

LNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAISP 

DVLRYVINLADGNGNTALHYSVSHSNFEIVKLLL 

DADVCNVDHQNKAGYTPIMLAALAAVEAEKDM 

Kl V bbLr OCCjD vN AKA SQ AGQTALML A V SHG Rl 

DMVKGLLACGADVNIQDDEGSTALMCASEHGH 

VEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGH 

KDIAVLLYAHVNFAKAQSPGTPRLGRKTSPGPTH 

RGSFD 


2990 


A 


6? 


1687 


ERLRPGQRAIRGPVPAAGACASLPPRAGPAQGRH 
AALGGAEPGSHLHCGVRLQRREEPGGQQRLLPQ 
RGGSAQTGHQHPGPYECQCPGPQPGGTTPALLSL 
ILEETRGPPASANPDKDHSTQPGTMGRKKIQISRI 
LDQRNRQVTFTORKFGLMKKAYELSVLCDCEIA 
LHFNSATRLFQYASTDMDRVLLKYTEYSEPHESR 
TNTDILETLKRRGIGLDGPELEPDEGPEEPGEKFR 
RLAGEGGDPALPRPRLYPAAPAMPSPDVVYGAL
PPPG\CDPSGLGEALPAQSRPSPFRPAAPKAGPPG 
LGHPLFSPSHLTSKTPPPLYLPTEGRRSDLPGGLA 
GPRGGLNTSRSLYSGLQNPCSTATPGPPLGSFPFL 
rOO^rVOAbAWARRVPQPAAPPRRPPQSSIKSER 
LFLRPPGAPATFLRPSPIPCSSPGPWQSLCGLGPPV 
CAGCPWPTAGPGRRSPGGTSPERSPGTARARGDP 
\TSLQAFSEKTHTVTAPLRGGGLEVGGWTQSSAG 
GLLSFFLFVCISTNKNARGVRGPEKK 


2991 


A 


3


1159 


IPQPLHCASPKEEMSLRCGDAARTLGPRVFGRYF 

CSPVRPLSSLPDKKKELLQNGPDLQDFVSGDLAD 

RSTWDEYKGNLKRQKGERLRLPPWLKTEIPMGK 

NYNKLKNTLRNLNLHTVCEEARCPNIGECWGGG 

EYATATATIMLMGDTCTRGCRFCSVKTARNPPP 

LDASEPYNTAKAIAEWGLDYWLTSVDRDDMP 

DGGAEHIAKTVSYLBCERNPKILVECLTPDFRGDL 

KAlEKVALSGLDVYAHiSrvETV 

NFDOSLRVLrCKA'K''K'VnPnVT<lliCT9TMT OI 

i-^l. JL/yUl^XV V i-fAvJL i_TVXv_LV V yf JJ V AlJJCS. 1 OllVJUo VJ l^VJ JLIN VJ\Lt 

QVYATI^KALREADVDCLTLGQYMQPTRRHLKV 
EEYITPEKFKYWEKVGNELGFHYTASGP\LVRSS 
YKAGEFFLBGsTLV AKRKTKDL 


2992 


A 


3 


1636 


PVPGVPTSPPSCCPQDMQGPWVLLLLGLRLQLSL 

GVIPAEEENPAFWNRQAAEALDAAKKLQPIQKV 

AKNLILFLGDGLGVPTVTATRTLKGQKNGKLGPE 

TPLAMDRFPYLALSKTYNVDRQVPDSAATATAY 

LCGVKANFQTIGLSAAARFNQCNTTRGNEVISV 

MNRAKQAGKSVGWTTTRVQHASPAGTYAHTV 

NIWWYSDADMPASARQEGCQDIATQL1SNMDID 

VELGGGRKYMFPMGTPDPEYPADASQNGIRLDG 

KNLVQEWLAKHQGAWYVWl^RTELMQASLDQS 

VTHLMGLFEPGDTKYEIHRDPTLDPSLMEMTEA 

ALRLLSRNPRGFYLFVEGGR1DHGHHEGVAYQA 

LTEAVMFDDAIERAGQLTSEEDTLTLVTADHSH 

vroruu i ll^lvUoolrOJLAroJS^v^'^^A Y 1 ML YON 

GPGYVFNSGVRPDVNESESGSPDYHQQAGWPLS 

SETHGGEDVAVFARGPQAHLVHGVQEQSFVAH 

VMAFAACLEPYTACDLAPPACTTDAAHPVAASL 

PLLAGTLLLLGASAAP


2993 


A 


3 


685 


DAWARLLKMNIILFGKAKPKAPPPSLTDCIGTVD 

SRAESIDKKISRLDAELVKYKDQEKKMREGPAKN 

\A\rw a 7 u\/t x^r\v"D'\/rvxsr\r\j>T\KTf a wtcuctiia 
m VKyis-ALKVJLlSA^JKJKM I byviwWLAVENoHb 1 W\ 

TS\HYTIQSLKDTKTTVDAMKLGVKEMKKAYKQ

VKIDQiEDLQDQLEDMMEDANEIQEALSRSYGTP 

ELDEDDLEAELDALGDELLADEDSSYLDEAASA 

PAIPEGVPTDTKNKDGVLVDEFGLPQIPAS 


2994 


A 


1710 


161 


RRCELTPFIIKTLILPKSWGAFPEDVVMQHVSSSQ 

SSQRHVQWPGACPGAGEEQPACSQPSLPLTLPSP- 

SHQLQQLMVRGGPAGGQN3V1NVDLQGVGPGLQ 

GSPQVTLAPLPLPSPTSPGFQFSAQPRRFEHGSPS 

Y1QVTSPLSQQVQTQSPTQPSPGPGQALQNVRAG 

APGPGLGLCSSSPTGDFVDASVLVRQISLSPSSGG 

HFVFQDGSGLTQIAQGAQVQLQHPGTPITVRERR 

PSQPHTQSGGT1HHLGPQSPAAAGGAGLQPLASP 

SHITTA>JLPPQISSnQGQLVQQQQVLQGPPLPRPL 

GFERTPGVLLPGAGGAAGFGMTSPPPPTSPSRTA 

VPPGLSSLPLTSVGNTGMKKVPKKLEEIPPASPE 

IviA^MKivi^LJLJJ Y rlnC^bMCjAJLKiiVF K£ YLIELFF 

LQHFQGhMMDFLAFKERLYGPLQAYLRQNDLDI 

EEEEEE\HFEVINDEVKVVARKHGQPGTPVAIAT\ 

QLPPRTSAAFPAQQQPLQVLSDGSTVQLPRLSSL 

GFEDSMC 


2995 


A 


3


924 


SAPSGIDASTHAFARCKHPDWRRDPSIPIYGLRQS 
ILLNTRLQDCYVDSPALTNIWMARTCAKQNINAP 
APATTSSWEWRNPLIASSFSLVKXVLRRQLKNK 
CCPPPCKFGEGKLSKRLKHKDDSVMKATQQARK 
RNFISSKSKQPAGHRRPAGGIRESKESSKEKKLTV 

POTYT T7TYP V A T?TJ"\/ A A T\f^ A T Dr^r^O^T A A \\n/T*\ D \ / 

Jvy ujL.cJJiv Y AHJti V AA 1 \\l AJLr kJV ou 1 AA W Ku \K V 
LLPETQKRQQLSEDTLTIHGLPTEGYQALYHAW 
EPMLWNPSGTPKRYSLELGKAIKQKLWEALCSQ 
GAISEGAQRDRFPGRKQPGVHEEPVLKKWPKLK 
SKK 


2996 


A


3


1713 


GKFGIKPSQRRISGKSTFHSEMEGEDTRDDSLYSI 

LEELWODAEOIKRCOFK1TMKT T SttTTFLNKKTI M 

TEWDYEYKDFGKPVHPSPNLILSQKJFUPHKRDSFG 

KSFKH^DIJBIHNKSNAAKNIJDKTIGHGQW 

NSSYSHHENTHTGVKFCERNQCGKVLSLKHSLS 

QNVKFPIGEKANTCTEFGKIFTQRSHFFAPQKIHT 

VEKPHELSKCVNVFTQKPLLSrYLRVHRBEKLYIX 
CTKM/CGKGLHPRNSELIMHEKTHTREKPYKCNE 
\CGKSFFQVSSLLRHQTTHTGEKLFECSECGKGFS 
LNSALNIHQKIHTGERHHKCSECGKAFTQKSTLR 
MHQR1HTGERSYICTQCGQAFIQKAHLIAHQRIH 

TGEKPYECSDCGKSFPSKSQLQMHKRJHTGEKPY 

TrTrpnv a ittvtd cxtt xtttjt/^vctjtytcx' cvTr a tt/ti 
1 JllAjKAr 1 JNKoiNJLIN 1 ri^Jvori I vjrJfcKo Y lUAbUvj 

KAFTDRSNFNKHQTIHTGEKPYVCADCGRAFIQK 

SELITHQR1HTTEKPYKCPDCEKSFSKKPHLKVHQ 

RIHTGEKPYICAECGKAFTDRSNFNKHQTIHTGD 

KPYKCSDCGKGFTQKSVLSMHRNIHT 


2997 


A 


3 


1763 


AASTRTMGSRHFEGIYDHVGHFGRFQRVLYFICA 

FQNISCGIHYLASVFMGVTPHHVCRPPGNVSQVV 

FHNHSNWSLEDTGALLSSGQKDYVTVQLQNGEI 

WELSRCSRNKRENTSSLGYEYTGSKKEFPCVDG 

YIYDQNTWKSTAVTQWNLVCDRKWLAMLIQPL 

FMFGGPTGIGAnTFGYF\SDRLGRRVVLWATSSS 

MFLFGIAAAFAVDYYTTMAARFFLAMVASGYLV 

VGFVYVMEFIGMKSRTWASVHLHSFFAVGTLLV 

ALTGYLVRTWWLYQMILSTVTVPFILCCWVLPE

TPFWLLSEGRYEEAQK\IVDIMAKWNRASSCKLS 

ELLSLDLQGPVSNSPTEVQKHNLSYLFYNWSITK 

RTLTVWLIWFTGSLGFYSFSLNSVNLGGNEYLNL 

FLLGVVEIPAYTFVC1AMDKVGRRTVLAYSLFC\S 

AJLACu V V M V irl^Kxl Y 1LU V V 1 AMW UKlLr KjAA 

FG\LIYLYTAELYPTIVRSLAVGSGSMVCRLASIL 

APFSVDLSSIWIFIPQLFVGTMALLSGVLTLKLPE 

TLGKRLATTWEEAAKLESENESKSSKLLLTTNNS

GLEKTEAITPRDSGLGE


2998 


A 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLAAC 

DVVGFDLDHTLCRYNLPESAPLIYNSFAQFLVKE 

KGYDKELLNVTPEDWDFCCKGLALDLEDGNFL

KLANNGTVLRASHGTKMMTPEVLAEAYGKKEW 

KHFLSDTGMACRSGKYYFYDNYFDLPGALLCAR 

WDYLTKLNNGQK'TFDFWKDIVAAIQHNYKMS 

AFKENCGIYFPEIKRDPGRYLHSRPESVKKWLRQ 

LKNAGKILLLITSSHSDYCRLLCA\YILGNDiFTDLF 

DIVITNALKPGFFSHLPSQRPFRTLENDEEQEALP 

SLDKPGWYSQGNAVHLYELLKKMTGKPEPKVV 

Y r uUbMiioUirr AKri Y oJN WJfc i VJ^UJlJ3LKuUJc.u 1 

RSQRPEESEPLEKKGKYEGPKAKPLNTSSKKWGS 
FFMDSVLGLENTEDSLVYTWSCKRISTYSTIAIPSI 
EAIAELPLDYICFTRFSSSNSKTAGYYPNPPLVLSS 
DETLISK 


2999 


A 


320 


2417 


LRRRKMTPQSLLQTTLFLLSLLFLVQGAHGRGHR 

EDFRFCSQRNQTHRSSLHYKPTPDLRISIENSEEA 

LTVHAPFPAAHPASRSFPDPRGLYHFCLYWNRH 

AGRLHLLYGKRDFLLSDKASSLLCFQHQEESLAQ 

GPPLLATSVTSWWSPQNISLPSAASFTFSFHSPPH 

TGAHNASVDMCELKRDLQLLSQFLKHPQKASRR 

NATVWKLQPTAGLQDLHIHSRQEEEQSEIMEYS 
VLLPRTLFQRTKGRSGEAEKRLLLVDFSSQALFQ 
DICNSSQVLGEKVLGIVVQNTKVANLTEPVVLTF 
QHQLQPKNVTLQCVFWVEDPTLSSPGHWSSAGC 
ETVRRETQTSCFCNHLTYFAVLMVSSVEVDAVH 
KHYLSLLSYVGCVVSALACLVTIAAYLCSRVPLP 
CRRKPIUDYmVHMNLLLAVFLLDTSFLLSEPVA 
LTGSEAGCRASA1FLHFSLLTCLSWMGLEGYNLY 
RLVVEVFGTYVPGYLLKLSAMGWGFPIFLVTLV 

Ai^vJJ VJJIN I vxrlLLA VriKlrlilJ VI 1 roML W LKJJoL. 

VSYIT^GLFSLVFLFmiAMLATMVVQILRLRPH 
TQKWSHVLTLLCLSLVLG\LPWALIFFSFASGTFQ 
LVVLYLFSnTSFQGFLIFIWYWSMRLQARGGPSP 
LKSNSDSARLPISSGSTSSSRJ 


3000 


A 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSP 

RRSRSAAEPAMALSMPLNGLKEEDKEPLIELFVK 

AGSDGESIGNCPFSQRLFMLWLKGVVFSVTTVD 

LKRKPADLQNLAPGTHPPFITFNSEVKTDVNKIEE 

FLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSA 

VTyXTCDDT A XTC A T rn /"> 7 T T/"TT nVT nEVT XTOTST nT*\ 

Y lisJN bKr JiAJNJbAJLrC,KuJUL.K 1 Lv^KUJb Y LN br Lru 
EIDENSMEDIKFSTRKFLDGNEMTLADCNLLPKL 
HIVKVVAKKYRNFDIPKEMTGJWRYLTNAYSRD 
EFTNTCPSDKEVEIVAYSDVAKRLHQVKSRLLKE 
VSFMSSP 


3001


A 


779 


2006 


LALTFRSALSTLPGSPMTSSGSPDLQLAWGPSLLP 

rtrro V Wbr ALrbCr AGPCrLLPLbD I QCj W WGPN 

WLAPPSAALCRPDAAVWPDLPSSNILLVTPPPAK 

*SAVAV*PCPRGAHSLERAARQYTISGSSTSQSGK 

CSKRDTKCCAVTTSWGCFWQKHWKGDEDSGW 

AFQEGSHLGEGHL 


3002 


A 


909 


2799 


VEEAWTVWLHWGVRECLLEEETNQKEEAASSN 

WTKARGPFWQEDWVWDMRLKMTTRNFPEREV 

PCDVEVERFTREVPCLSSLGDGWDCENQEGHLR 

QSALTLEKPGTQEAICEYPGFGEHL1ASSDLPPSQ 

RVLATNGFHAPDSNVSGLDCDPALPSYPKSYAD 

KRTGDSDACGKGFWISMEVIHGRNPVREKPYKY 

PESVKSFNHFTSLGHQOMKRGICKSYEGKNFENI 

FTLSSSLNENQRNLPGEKQYRCTECGKCFKRNSS 

* LVLHHRTHTGEKPYTCNECGKSFSKNYNLIVHQ

RIHTGEKPYECSKCGKAFSDGSALTQHQRIHTGE 

KPYECLECGKTFNRNSSLELHQRTHTGEKPYRCN 

ECGKPFTDISHLTVHLRIHTGEKPYECSKCGKAF 

RDGSYLTQHERTHTGEKPFECAECGKSFNRNSHL 

IVHQKIHSGEKPYECKJECGKTFEESAYLIRHQRIH 

1 uclsJr 1 UUJN KLr KN 1 AOl^iKri^K I Jet 1 \JE,isJr Y 

ECNQCGKAFRDSSCLTKHQR1HTKETPYQCPECG 
KSFKQNSHLAVHQRLHSREGPSRCPQCGKMFQK 
SSSLVRHQRAHLGEQPMET*WLGAT*VFQFTLTP 
VFRRRVLDLTPLWSVEKNPLSYPVN 


3003 


A 


2 


1489 


SLTEHLSFFQPTAHSLTSLLGTMTTCSRQFTSSSS 
MKGSCGIGGGIGGGSSR1SSVLAGGSCRAPSTYG 
GGLSVSSRFSSGGACGLGGGYGGGFSSSSSFGSG 
FGGGYGGGLGAGFGGGLGAGFGGGFAGGDGLL 
VGSEKVTMQNLNDRLASYLDKVRALEEANADL 

ATffiNAQPILQIDNARLAADDFRTKYEHELALRQ 
TVEADVNGLRRVLDELTLARTDLEMQffiGLKEE 
LAYLRKNH*EEMLALRGQTGGEVNVETDAAPG 
VDLSCILNEMRNQYEQMAEKNRRDAETWFLSKT 
EELNKEVASNSELVQSSRSEVTELRRVLQGLEIEL 
QSQLSMKASLENSLEETKGRYCMQLSQIQGLIGS 
VEEQLAQLRCEMEQQSQEYQILLDVKTRLEQEIA 
TYRRLLEGEDAHLSSQQASGQSYSSREVFTSSSSS 
SSRQTRPILKEQSSSSFSQGQSS 


3004


A 


2 


940 


GCAPDTRFFVPEPGGRGAAPWVALVARGGCTFK 

DKVLVAARRNASAWLYNEERYGNITLPMSHAG 

TGNIVVIMISYPKGREILELVQKGIPVTMTIGVGT 

RHVQEFISGQSVVFVAIAFITMMIISLAWLIFYYIQ 

RFLYTGSQIGSQSHRKETKKVIGQLLLHTVKHGE 

KGIDVDAJENCAVCDENFKVKDnRILPCKPDFHRJC 

IDPWLLDHRTCPMCKLDVIKALGYWGEPGDVQE 

MPAPESPPGRDPAANLSLALPDDDGSDESSPPSA 

SPAESEPQCDPSFKGDAGENTALLEAGRSDSRHG 

GPIS 


3005 


A 


184 


2552 


TMTIHQFLLLFLFWVCLPHFCSPEIMFRRTPVPQQ 

RILSSRVPRSDGKILHRQKRGWMWNQFFLLEEY 

TGSDYQYVGKLHSDQDKGDGSLKYILSGDGAGT 

LFIIDEKTGDIHATRMDREEKAFYTLRAQAINRR 

TLRPVEPESEFVIKIHDINDNEPTFPEEIYTASVPE 

MSWGTSWQVTATDADDPSYGNSARVIYSrLQ 

GQPYFSVEPETGURTALPNMNRENREQYQWIQ 

AKDMGGQMGGLSGTTTVNITLTDVNDNPPRFPQ 

NTIHLRVLESSPVGTAIGSVKATDADTGKNAEVE 

YRnbGDGTDMFDlVTEKDTQEGHTVKKPLDYES 

RRLYTLBCVEAENTHVDPRFYYLGPFKDTTIVKISI

EDVDEPPVFSRSSYLFEVHEDIEVGTIIGTVMARD

PDSISSPIRFSLDRHTDLDRIFNIHSGNGSLYTSKP 

LDRELSQWHNLTVIAAEINNPKETTRVAVFVRIL 

DANDNAPQFAVFYDTFVCENARPGQLIQTISAVD 

KDDPLGGQKFFFSLAAVNPNFTVQDNEDNTARIL 

TRKKGFNRHEISTYLLPVVISDNDYPIQSSTGTLTI 

RVCACDSQGNMQSCSAEALLLPAGLSTGALIAIL 

LCEILLVIVYLFAALKRQRKKEPLILSKEDIRDNIV 

SYNDEGGGEEDTQAFDIGTLRNPAAffiEKKLRJRD 

IIPETLFIPRRTPTAPDNTDVRDFINERLKEHDLDP 

TAPPYDSLATYAYEGNDSIAESLSSLESGTTEGD 

QNYDYLREWGPRFNKLPQKYGGGESDKDS 


3006 


A 


2 


541 


GRVDKTWWGKSVGIMLTELEKALNSIIDVYHKY 

SLIKGNFHAVYRDDLKKLLETECPQYIRKKGAD 

VWFKELDINTDGAVNFQEFLILVIKMGVAALNSn 

DVYHKYSLKGNFHAVYRDDLQKLLETECPQYI 

RKKGADVWFKELDINTDGAVNFQEFLILVIKMG 

VGSPQKKVASYF


3007 


A 


1 


1253 


MYEGIRCLLKALLGFVSLAIGTLYCPRQYRPFPG 

SLGIEAIKVTEPIPDSYYRDMATWPTHAPSVEEG 

GQGRFGNQADHFLGSLAFAKLLNRSLAVPSWIE 

YQHHKPPFTNLHVSYQKYFKLEPLQAYHRVISLE 

DFMEKLAPTrTO^PEKRVAYCFEVAAQRSPDKKT 

CP3VKEGNPFGPFWDQFHVSFNKSELFTGISFSAS 

V~D TTf^WTUC^D T?OT>T/ r T7UTD\ 71 AT TIP A Tl A AT?tl\n rrTjn t> 

Y KbCj W d^ivr or JsJbHP VLALr GArAQrPVLbbriRP 

LQKYMVWSDEMVKTGEAQIHAHLVRPYVGIHL 

RIGSDWKNACAMLKDGTAGSHFMASPQCVGYS 

RSTAAPLTMTMCLPDLKEIQRAVKLWVRSLDAQ 

SVYVATDSESYVPELQQLFKGKVKWSLKPEVA 

QVDLYILGQADHFIGNCVSSFTAFVKRERDLQGR 
PSSFFGMDRPPKLRDEF 


3008 


A 


3136 


1898 


TARGGGSEPGPTMAANYSSTSTRREHVKVKTSS 

QPGFLERLSETSGGMFVGLMAI^LSFYLIFTNEG 

RALKTATSLAEGLSLVVSPDSIHSVAPENEGRLV 

HnGALRTSKLLSDPNYGVHLPAVKLRRHVEMY 

QWVETEESREYTEDGQVKKETRYSYNTEWRSEII

NSKNFDREIGHKWRAMAGESFMATAPFVQIGRF 

FLSSGLTDKVDNFKSLSLSKLEDPHVDIIRRGDFF 

YHSENPKYPEVGDLRVSFSYAGLSGDDPDLGPA 

HVvTvIARQRGDQLWFSIKSGDTLLLIJ^ 

AEEVFHRELRSNSMKTWGLRAAGWMAMFMGL 

NLMTRILYTLVDWFPVFRDLVNIGLKAFAFCVAT 

SLTLLTVAAGWLFYRPLWALL1AGLALVPILVAR 

TRVPAKKLE 


3009 


A 


93 


659 


DAAVAMTAQGGLVANRGRRFKWAIELSGPGGG 

SRGRSDRGSGQGDSLYPVGYLPKQVPDTSVQET 

DRILVEKRCWDIALGPLKQffMNLFIMYMAGNTI 

SIFPTMMVCMMAWRPIQALMAISATFKMLESSS 

QKFLQGLVYLIGNLMGLALAVYKCQSMGLLPTH 

ASDWLAFIEPPERMEFSGGGLLL 


3010 


A 


2 


1041 


LIDSAKARYWTQRGTWVYDNALLLLLKCLWSN 

VWEGTMASSNTVLMRLVASAYSIAQKAGMIVR 

RVIAEGDLGIVEKTCATDLQTKADRLAQMSICSS 

LARKFPKLTIIGEEDLPSEEVDQELIEDSQWEEILK 

QPCPSQYSAIKEEDLVVWVDPLDGTKEYTEGLL 

DNVTVLIGIAYEGKAIAGVINQPYYNYEAGPDAV 

LOK 1 1 WO VLGLuAFGFQLKEVPAGKrilri 1 TKSH 

SNKLVTIX^VAAMNPDAVLRVGGAGNKIIQLIEG 

KASAYVFASPGCKKWDTCAPEVILHAVGGKLTD 

IHGNVLQYHKDVKHMNSAGVLATLRNYDYYAS 

RVPESIKNALVP 


3011 


A 


291


1452


SPQKTMRSHTITMTTTSVSSWPYSSmiMRFITNH 

SDQPPQNFSATPNVTTCPMDEKLLSTVLTTSYSVI 

FP/GLVGMULYVFLGIHRKRNSIQIYLLNVAIAD 

LLLIFCLPFRIMYHINQNKWTLGVILCKVVGTLFY

MNMYISIILLGFISLDRYIKINRSIQQRKAITTKQSI 

YVCCIVWMLALGGFLTMnLTLKKGGHNSTMCF 

H Y KUKHN AKGKAJu^NF IL VVMF WLIFLLIILS YIKI 

GKNLLRISKRRSKFPNSGKYATTAKNSFIVLIIFTI 

CFVPYHAFRFIYISSQLNVSSCYWKEIVHKTNEIM 

LVLSSTOSCLDPVMYFLMSSNmKIMCQLLFRRF 

QGEPSRSESTSEFKPGYSLHDTSVAVKIQSSSKST 


3012 


A


246 


1346 


TEPVGYTKAEEPIAMRSLGALLLLLSACLAVSAG 

PVPTPPDMQVQE>nnsnSRIYGKWYNLAIGSTCPW 

LKKIMDRMTVSTLVLGEGATEAEISMTSTRWRK 

GVCEETSGAYEKTDTDGKFLYHICSKWNITMESY 

VVHTNYDEYAIFLTKKFSRHHGPTITAKLYGRAP 

QLRETLLQDFRVVAQGVGIPEDSIFTMADRGECV 

Jr\xb^llr\l^iJLlJt^KVJ^ Itv 
TKKEDSCOLGY^AGPPMGVrrSRYFYNGTSMAC 
ETFQYGGCMGNGNNFVTEKECLQTCRTVAACN 
LPIVRGPCRAFIQLWAFDAVKGKCVLFPYGGCQ 
GNGNKFYSEKECREYCGVPGDGDEELLRFSN 


3013 


A


67 


379 


RQMALLKAKKDLISAGLKEFSVLLlSiQQVFNDPL 

VSEEDMVTVVEDWMNFYINYYRQQVTGEPQER 
DKALQELRQELNTLANPFLAKYRDFLKSHELPSH 

rrroo 


3014 


A 


1 


373 


GTSWSTLRAVMSASVVSVVSRVLEEYLSSTPQRL 
KLLDAYLLYILLTGALQFGYCLFVLTFHFNSLLLF 
FFFCVGSFHSNVYFLLFTLSFLCFLFIAYFFLIRFFS 
LFIWFFHVFFIELSLFYF 


3015 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHIVGYFSKEKESPDGNNVACILTLPPYQRRGYG

WSWVLLEILRDFRGTLSIKDLSQMTSITQNDIIST 
LQSLNMVKYWKGQHVICVTPKLVEEHLKSAQY 
KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI
AVT 


3016 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHK1YCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG

AHIVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

JSJrJUl/\ri i zSLrOivJLxio 1 VOoriiiS^JjOlJJLOJsJUb i Ko I 

WSWVLLEILRDFRGTLSDCDLSQMTSITQNDIIST 

LQSLNMVKYWKGQHVICVTPKLVEEHLKSAQY 

KECPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 

AVT 


3017 


A 


38 


704 


EAHPGGQLGSERNGVRMDEDVLTTLKILnGESG 
VGKSSLLLRFTDDTFDPELAATIGVDFKVKTISVD

OIn JS^vlSJ^Al WU J. AUvilKrK 1 L Irbi y KajAI^OVIL. 

VYDVTRRDTFVKLDNWLNELETYCTRNDIVNM 
LVGNKIDKENREVDRNEGLKFARKHSMLFIEAS 

VKLSHREEGQGGGACGGYCSVL 


3018 


A 


2640 


2861 


APVLILQMVKLSIVLTPQFLSHDQGQLTKELQQH 
VKSVTCPCEYLRKVSECRQMGPGALEQFPGLSC
HTSHSG 


3019 


A 


1307 


711


PGITMAASLVGKKIVFVTGNAKKLEEVVQILGDK 

QGPVLVEDTCLCFNALGGLPGPYIKWFLEKLKPE 
GLHQLLAGFEDKSAYALCTFALSTGDPSQPVRLF 
RGRTSGRIVAPRGCQDFGWDPCFQPDGYEQTYA 
EMPKAEKNAVSHRFRALLELQEYFGSLAA 


3020 


A 


1202 


180 


VSCLPTSCKMITLNNQDQPVPFNSSHPDEYKIAA 
LVFYSCIFnGLFVNITALWVFSCTTKKRTTVTIYM 
MNV AL VDLIFIMTLPFRMFYY AKDEWPFGE YFC 
QILGALTWYPSIALWLLAFISADRYMAIVQPKY 

AKELKNTCKAVLACVGVWIMTLri-rrPLLLLYK 

DPDKI)STPATCLKISDnYLKA\rNVLNLTRLTFFF 

LIPLFIMIGCYLVnHNLLHGRTSKLKPKVKEKSIRl 

nTLLVQVLVCFMPFHICFAFLMLGTGENSYNPW 

GArTTTLMNLSTCLDVILYYIVSKQFQARVISVM 

LYRNYLRSMRRKSFRSGSLRSLSNTNSEML 


3021


A 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQT 
KRKKPRRYWEEETVPTTAGASPGPPR2STKKNREL 
RPQRPK^AYILKXSRJSKKPQVPKKPREWKNPES 
QRGLSGAQDPFPGPAPVPVEVVQKFCRIDKSRKL 
PHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEE 
PGFLEGEDGEDTAKICQADIVEAVDIASAAKHFD 
LNLRQFGPYRLNYSRTGRHLAFGGRRGHVAALD 
WVTKKLMCEINrVMEAVRDIRFLHSEALLAVAQN 
RWLfflYDNQGIELHCIRRCDRVTRLEFLPFHFLLA
TASETGFLTYLDVSVGKIVAALNARAGRLDVMS 
QNPYNAVIHLGHSNGTVSLWSPAMKEPLAKILC 
HRGGVRAVAVDSTGTYMATSGLDHQLKIFDLRG 
TYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDVV 
NIWAGQGKASPPSLEQPYLTHRLSGPVHGLQFCP 
FEDVLGVGHTGGITSMLVPGAGEPNFDGLESNPY 
RSRKQRQEWEVKALLEKVPAELICLDPRALAEV 
DVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKG 
RSSTASLVKRKRKVMDEEHRDKVRQSLQQQHH 
KEAKAKPTGARPSALDRFVR 


3022 


A 


1 


2249 


MTAQDSNTSAHAQRDGPELPASSSWRSFWPLSC 

LSSPPVSAVEVATEGRDREVAKVGQRFCDTTSGE 

LRQARDRDCCVRMPAPVGRRSPPSPRSSMAAVA

LRDSAQGMTFEDVAIYFSQEEWELLDESQRFLYC

DVMLENFAHVTSLGYCHGMENEAIASEQSVSIQ

VRTSKGNTPTQKTHLSEIKMCVPVLKDILPAAEH 

QTTSPVQKSYLGSTSMRGFCFSADLHQHQKHYN 

EEEPWKRKVDEATFVTGCRFHVLNYFTCGEAFP 

APTDLLQHEATPSGEEPHSSSSKHIQAFFNAKSYY 

KWGEYRKASSHKHTLVQHQSVCSEGGLYECSK

CEKAFTCKNTLVQHQQfflTGQKMFECSECEESFS 

KKCHLILHKIfflTGERPYECSDREKAFIHKSEFIHH 

QRRHTGGVRHECGECRKTFSYKSNLIEHQRVHT 

GERPYECGECGKSFRQSSSLFRHQRVHSGERPYQ 

CCECGKSFRQIFNLIRHRRVHTGEMPYQCSDCGK 

SFSCKSELIQHQRIHSGERPYECRECGKSFRQFSN

LIRHRSIHTGDRPYECSECEKSFSRKFILIQHQRVH 

TGERPYECSECGKSFTRKSDLIQHRRIHTGTRPYE 

GSECGKSFRQRSGLIQHRRLHTGERPYECSECGK 

SFSQSASLIQHQRVHTGERPYQCCECGKSFRQIFN 

LIRHRRVHTGEMPYQCSDCGKSFSCKSELIQHRRJ 

HSGERPYECSECGKSFSRKSNLIRHRRVHTEERP 


3023 


A 


3148 


634 


AAGALRCLAAFPRAEPASRGRQSSPARACAASR 
AERATAAAMAHRCLRLWGRGGCWPRGLQQLL 
VPGGVGPGEQPCLRTLYRFVTTQARASRNSLLTD 

VMGEKKESKPAATTRSSGGGGGGGGKRGGKXD 
DSHWWSRPQKGDEPWDDKDFRMFFLWTALFWG 
GVMFYLLLKRSGRE1TWKDFVNNYLSKGVVDRL 
EVVNKRFVRVTFTPGKTPVDGQYVWFNIGSVDT 

FEI04LETLQQELGIEGENRVPVVYIAESDGSFLLS 

MLPTVLIIAFLLYTIRRGPAAIGRTGRGMGGLFSV 

GETTAKVLKDEIDVKFKDVAGCEEAKLEIMEFV 

NFLKNPKQYQDLGAKIPKGAILTGPPGTGKTLLA 

KATAGEANVPFriVSGSEFLEMFVGVGPARVRDL 

FALARKNAPCILFE>EIDAVGRKRGRGNFGGQSE 

QENTLNQLLVEMDGFNTTTKWILAGTNRPDIL^ 

PALLRPGRFDRQIFIGPPDIKGRASIFKVHLRPLKL 

DSTLEKDKLARKLASLTPGFSGADVANVCNEAA 

LIAARHLSDSINQKHFEQAIERVIGGLEKKTQVLQ 

PEEKKTVAyHEAGHAVAGWYLEHADPLLKVSII 

PRGKGLGYAQYLPKEQYLYTKEQLLDRMCMTL 

GGRVSEEIFFGRJTTGAQDDLRKVTQSAYAQIVQ 

FGMNEKVGQISFDLPRQGDMVLEKPYSEATARLI 

DDEVRILINDAYKJR.TVALLTEKKADVEKVALLL 

LEKEVLDKNDMVELLGPRPFAEKSTYEEFVEGT 

GSLDEDTSLPEGLKDWNKEREKEKEEPPGEKVA 

N 


3024 


A 


274 


1455 


LRACSLPSMSALEKSMHLGRLPSRPPLPGSGGSQ 

SGAKMRMGPGRKRDFSPVPWSQYFESMEDVEV 

ENETGKDTFRVYKSGSEGPVLLLLHGGGHSALS 

WAVFTAAnSRVQCRIVALDLRSHGETKVKNPED 

LSAETMAKDVGNVVEAMYGDLPPP1ML1GHSMG 

GA1AVHTASSNLVPSLLGLCMIDVVEGTAMDAL 

NSMQNFLRGRPKTFKSLENAIEWSVKSGQIRNLE 

SARVSMVGQVKQCEGITSPEGSKSrVEGIIEEEEE 

DEEGSESISKJRKKEDDMETKKDHPYTWRIELAKT 

EKYWDGWFRGLSNLFLSCPIPKLLLLAGVDRLD 

KDLTIGQMQGKFQMQVLPQCGHAVHEDAPDKV 

AEAVATFLIRHRFAEPIGGFQCVFPGC 


3025 


A 


621 


306 


YHGGQRGRAGGSFRSVQGWGGQLRNPFRTSKSL 
SWKGLSSLLFPLYNLQMGRPRDRKELGRGHSPP 
HLEGPHMLPSGAARWRWLEAPVLVLEPLVLRPA 
AAPTP 


3026 


A 


1533 


454 


AKVPQSTREEKRENGLEARSPAINLMGFNVEEM 
YEAHAWIQRILSLQNKOrnffiN^ 
SQLQKTSSVS1TEIISPGRTELEIEGARADLIEWM 
N1EDMLCKVQEEMARKKERGLWRSLGQWTIQQ 
QKTQDEMKENT3FLKCPVPPTQELLDQKKQFEKC 
GLQVLKVEKJDNEVLMAAFQRKKKMMEEI<XHR 
QPVSHRLFQQVPYQFCNVVCRVGFQRMYSTPCD
PKYGAGIYFTKNLKNLAEKAIOQSAADKLIYVFE 
AEVLTGFFCQGHPLN1ATPPLSPGAIDGHDSVVD 
NVSSPETFVIFSGMQAIPQYLWTCTQEYVQSQDY 
SSGPMRPFAQHPWRGFASGSPVD 


3027 


A 


179 


703 


PFHLGASSNTFRLQVQTQESKAQKEVKMGFIFSK 
SMNESMK^QKEFMLMNARLQLERQLIMQSEMR 
ERQMAMQIAWSREFLKYFGTFFGLAAISLTAGAI 
KKKKPAFLVP3VPLSFILTYQYDLGYGTLLERMK 
GEAED1LETEKSKLQLPRGMITFESIEKARICEQSR 
rriUJv 


3028 


A 


876 


1226 


AVGKJEPESSSTWVRDREGmRSRRSMKMLWKLT 
DNIKYEDCEVSATPARSSVRSQAPSLTLPLLLLSL 
QPAAKRGWDKLSPAQRPSLGFARRTRGRSCRER 
TWMLPSLVSEFLHRD 






WO 01/57190 PCT/US01/04098





3029 


A 


3 


1731


FREGRFGSSCAVAAPLAGFQGLIECGYLAVDSPP 

SCWTPGGSNPAAPLPQALLPPRLPPTVLPFLGPGL 

SGELEMFTLPQKDFRAPTTCLGPTCMQDLGSSHG 

EDLEGECSRKLDQkLPELRGVGDPAMISSNTSYL 

SSRGRMIKWFWDSAEEGYRTYHMDEYDEDKNP 

SGm^GTSENKLCFDLLSWRLSQRDMQRVEPSL 

LQYADWRGHLFLREEVAKFLSFYCKSPVPLRPE 

NVVVLNGGASLFSALATVLCEAGEAFLEPTPYYG 

AITQHVCLYGNIRLAYVYLDSEVTGLDTRPFQLT 

VEKLEMALREAHSEGVKVKGLILISPQNPLGDVY 

SPEELQEYLWAKRHRLHVIVDEVYMLSVFEKSV 

GYRSVLSLERLPDPQRTHVMWATSKDFGMSGLR 

FGTLYTENQDVATAVASLCRYHGLSGLVQYQM 

AOT T R nWTWnVVT PFXTT-TAPT T<f A A UTWCPT7T 

RALGIPFLSRGAGFFIWVDLRKYLLKGTFEEEML
LWRRFLDNKVLLSFGKAFECKEPGWFRFVFSDQ 
VHRLCLGMQRVQQVLAGKSQVAEDPRPSQSQEP 
SDQRR 


3030 


A 


1 


584 


PWLPWSDGRAARSSRKCPRSRFPVQVGKMAVST 

TDSQKDMIEIPLPPWQERTDESIETKRARLLYESR 
KRGMLENCILLSLFAKEHLQHMTEKQLNLYDRLI 
NEPSNDWDIYYWATEAKPAPEIFENEVMALLRD 
FAKNKNKEQRLRAPDLEYLFEKPR 


3031


A 


1177 


359 


SLWPWILMDDSLMQ1SLQLLCVYTANFPNGCSSL 
CWSSCGQHPVQATHRGAVSNSLMLCDLKLASQM 
PLENTTVQQMVFMLLSNLALSHDCKGVIQKSNF 
LQNFLSLALPKGGNKHLSNLTILWLIO.LLNISSGE 

nrinrMVATT PT rV^PT FiT T TTHTVyfCt^V"fc r IJV'CCT>T T DT T T 

FHWCFSPA>^KJiANEKVITVljWVCL^ 
AQRIGAAALWALIYNYQKAKTALKSPSVKRRVD 
EAYSLAKKTFPNSEANPLNAYYLKCLENLVQLL 
NSS 


3032 


A 


2 


1242 



GISGRPPRPAKRRMGKNPVRPPRALPPVPSQDDIP 
LSRPKKICKPRTKNTPASASLEGLAQTAGRRPSEG 
NEPSTKELKEHPEAPVQRRQKKTRLPLELETSST
QKKSSSSSLLRNENGIDAEPAEEAVIQKPRRKTK
KTQPAELQYANELGVEDEDIITDEQTTVEQQSVF 
TAPTGISQPVGKVFVEKSRRFQAADRSELIKTTEN 
IDVSMDVKPSWTTRDVALTVHRAFRMIGLFSHG 
FLAGCAVWNTVVIYVLAGDQLSNLSNLLQQYKT 

T AYPFH^T T VT T T AT ^TT^i A T7TYR TTiP A TiTTQV A TPXTP 

LALDPTALASFLYFTALILSLSQQMTSDRIHLYTT 
SSVNGSLWEAGIEEQILQPWIVVNLVVALLVGLS 
WLFLSYRPGMDLSEELMFSSEVEEYPDKEKEIKA
SS 


3033 


A 


3 


1436 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQ 

VRDTSSRIAKGGVDHTKMSLHGASGGHERSRDR 

RRSSDRSRDSSHERTESQLTPCIRNVTSPTRQHHV 

EREKDHSSSRPSSPRPQKASPNGSISSAGNSSRNS 

SQSSSDGSCKTAGEMVFVYENAKEGARNIRTSER 

VTLIVDNTRFVVDPS 

HNFTRPNEKGEYEVAEGIGSTVFRAILDYYKTGD 
RCPDGISIPELREACDYLCISFEYSTIKCRDLSALM 
HELSNDGARRQFEFYLEEMILPLMVASAQSGERE 

CHiVVLTDDDVVDWDEJEYPP^ 

LYRFFKYmNRDVAKSVLKERGLKKTRLGIEGYP 

TYKEKVKKRPGGRPEVIYhTA^QRPFIRMSWEKE 

EGKSRHVDFQCVKSKSITNLAAAAADIPQDQLV 

VMHPTPQVDELDELPIHPPSGNSDLDPDAQNPML 


3034 


A 


3


1972 


SSLAQHRSVAVLGWPAGWAAARARPAMQGGN 

SGVRKREEEGDGAGAVAAPPAEDFPAEGPDPEY 

DESDVPAEIQVLKEPLQQPTFPFAVANQLLLVSL 

LEHLSHVHEPNPLRSRQVFKLLCQTFIKMGLLSSF 

TCSDEFSSLRLHHNRAITHLMRSAKERVRQDPCE 

DISR1QKIRSREVALEAQTSRYLNEFEELAELGKG 

GYGRVYKVRNKLDGQYYAIKKILIKGATKTVCM

KVLREVKVLAGLQHPNIVGYHTAWIEHVHVIQP 

RADRAAIELPSLEVLSDQEEDREQCGVKNDESSS 

SSIIFAEPTPEKEKRFGESDTENQNNKSVKYTTNL 

VIRESGELESTLELQENGLAGLSASSIVEQQLPLR 

RNSHLEESFTSTEESSEENVNFLGQTEAQYHLML 

HIQMQLCELSLWDW1VERNKRGREYVDESACPY 

VMANVATKIFQELVEGVFYIHNMGIVHRDLKPR 

NIFLHGPDQQVKIGDFGLACTDILQKNTDWTNR 

NGKJITPTHTSRVGTCLYASPEQLEGSEYTjAKSD 

MYSLGVVLLELFQPFGTEMERAEVLTGLRTGQL 

PESLRKRCPVQAKYIQHLTRRNSSQRPSAIQLLQS 

ELFQNSGNVNLTLQMKIIEQEKEIAELKKQLNLL 

SQDKGVRDDGKJDGGVG 


3035 


A 


110 


1172 


KLSCPCSHGTRVTAVRGPRLKAGVQWHDLGSLQ 

PPPSGLKQSSHLSLSSSWDFRHAPTHPETYTCPK 

MIEMEQAEAQLAELDLLASMFPGENELIVNDQL 

AVAELKDCmKKTMEGRSSKVYTTINMNLDVSD 

EKMAMFSLACILPFKYPAVLPEITVRSVLLSRSQQ 

TQLNTDLTAFLQKHCHGDVCILNATEWVREHAS 

A 7CT1 HTP P O TITT^ PTl f A P "I JIM T 1" >r T'T> T Tt TT\ rOT TT XT\T ' 

GYVSRDTSSSPTTGSWQSVDLIFTR^ 

NKCKRKNILEWAKELSLSGFSMPGKPGWCVEG 

PQSACEEFWARLRKLNWKRILIRHREDIPFDGTN 

DETERQRKFSIFEEKVFSVNGARGNHMDFGQLY 

QFLNTKGCGDVFQMFLWV 


3036 


A 


1 


2288 


FRFAERRAAAAESDVSAKMAGRSMQAARCPTD 

ELSLTNCAWNEKDFQSGQHVIVRTSPNHRYTFT 

LKTHPSWPGSIAFSLPQRKWAGLSIGQEIEVSLY 

TFDKAKQCIGTMTIEIDFLQKKSIDSNPYDTDKM 

AAEFIQQFNNQAFSVGQQLVFSFNEKLFGLLVKD 

IEAMDPSILNGEPATGKRQKIEVGLVVGNSQVAF 

EKAENSSLNLIGKAKTKENRQSnNPDWNFEKMG 

IGGLDKEFSDIFRRAFASRVFPPEIVEQMGCKHVK 

GILLYGPPGCGKTLLARQIGKMLNAREPKWNG 

PE1LNKYVGESEANIRKLFADAEEEQRRLGANSG 

LHIIIFDEIDAICKQRGSMAGSTGVHDTVVNQLLS 

KIDGVEQLNNILVIGMTNRPDLIDEALLRPGRLEV 

KMEIGLPDEKGRLQILHIHTARMRGHQLLSADV 

DIKELAVETKNFSGAELEGLVRAAQSTAMNRHI

A QTVVPArnA/lTrV A T7CT rw/TT? r:TM7T A QT T7XTTiTVT> 
JSJio 1 is. V H V UlVlErJvAlioi^ Kl V 1 KOJJr LAoUlIN UUsJr 

AFGTNQEDYASYIMNGIIKWGDPVTRVLDDGEL 
LVQQTKNSDRTPLVSVLLEGPPHSGKTALAAKIA 
EESNFPFmCSPDKMIGFSETAKCQAMKKIFDDA 
YKSQLSCVVVDDIERLLDYVP1GPRFSNLVLQAL 
LVLLKKAPPQGRKLLHGTTSRKDVLQEMEMLNA 
FSTTIHVPNIATGEQLLE ALELLGNFKDKERTTI A 
QQVKGKKVWIGIKKLLMLIEMSLQMDPEYRVRK 
FLALLREEGASPLDFD 


3037 


A 


1 


1347 


MLDTGSEHLNRJLICALPALQSAGSEGQNGSAESL 

GEGGTRDSDRARRKLRGGNKEIPTFYPCLVVRSP 

VTASDLRGTQDFAAYHGLSLILEPLGACNRLSVC 

VPVHSPPGMRVSPRSPSLRTLVIDPAEPAGAQRL 

RFSGKERSGEAGSAVEGLAVAVSMGDGGAERD 

RGPARRAESGGGGGRCGDRSGAGDLRADGGGH 

SPTEVAGTSASSPAGSRESGADSDGQPGPGEADH 

CRRILVRDAKGT1REIVLPKGLDLDRPKRTRTFFT 

AEQLYRLEMEFQRCQYVVGRERTELARQLNLSE 

FATSNILRLLEQGRLLSVPRAPSLLALTPSLPGLP 
ASHRGTSLGDPRNSSPRLNPLSSASASPPLPPPLP 
AVCFSSAPLLDLPAGYELGSSAFEPYSWLERKVG 
SASSCKKANT 


3038 


A 


974 


SOI 


RLETAPSLLLSRMACVISGWALSRGARTWTWAT 
PTGPVHRAQPAIRSLSAEGALTRLKEEKWPGRYI 
LPNHLTPPFLYKrE.GSVPPSHWRSPLISHSVNILA 
LNWR 


3039 


A 


1263 


111 


ACGIRHEGALPGLTATPEAMLRFLPDLAFSFLLIL 

ALGQAVQFQEYVFLQFLGLDKAPSPQKFQPVPYI 

LKKDFQDREAAATTGVSRDLCYVKELGVRGNVL 

RFLPDQGFFLYPKKISQASSCLQKLLYFNLSAIKE 

REQLTLAQLGLDLGPNSYYNLGPELELALFLVQE 

PHVWGQTTPKPGKMFVLRSVPWPQGAVHFNLL 

DVAKD WNDNPRKNFGLFLEIL VKEDRDSG VNFQ 

PFHTPART PfCJT MART T WTT "WPnnPWP^P'RTP'R A 

AffWKLSCKmCHRHQLFINFRDLGWHKWIIAP 
KGFMANYCHGECPFSLTISLNSSNYAFMQALMH 
AVDPEIPQAVCIPTXLSPISMLYQDNNDNVILRHY 
EDMVVDECGCG 


3040 


A 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFAGCTFALYL 
LSTRLPRGRRLGSTEEAGGRSLWFPSDLAELREL 
SEVLREYRKEHQAYVFLLFCGAYLYKQGFAIPGS 
SFLNVLAGALFGPWLGLLLCCVLTSVGATCCYL 

JLrOiJJLT VJJVV<Jl-» V Voir i J-/XV V /VJL»JL<\^I\J\. V JDClNIVJN Oi-zFT 

FLLFLRLFPMTPNWFLNLSAPILNIPIVQFFFSVLI 
GLIPYNFICVQTGSILSTLTSLDALFSWDTVFKLL 
AIAMVALIPGTLnCKFSQIOILQLNETSTANHIHSR 
KDT 


3041 


A


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 
LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 
KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 
CLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

OPKOTTPWnnVPTsIVn AR"MFT P A\AY T OFFAFFV 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 
DPSSKEDNPKWSMVDVQFVRMMPCRFIPLAELKS 
YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 
DFVLSLEEKEPS 


3042 


A 


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 
LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 
KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

GLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3043 


A 


153 


1133 


VGTAPAPGGRDRAPAMGSFQLEDFAAGWIGGA 

ASVIVGHPLDTVKTRLQAGVGyGNTLSCIRVVY 

RRESMFGFFKGMSFPLASIAVYNSVVFGVFSNTQ 

RFLSQHRCGEPEASPPRTLSDLLLASMVAGWSV 

GLGGPVDLIKIRLQMQTQPFRDANLGLKSRAVAP 

AEQPAYQGP VHCITTIVRNEGLAGLYRGASAML 

LRDVPGYCLYFEPYVFLSEWITPEACTGPSPCAV 

WLAGGMAGAISWGTATPMDVVKSRLQADGVY 

LNKYKGVLDCISQSYQKEGLKVFFRGITVNAVR 

GFPMSAAMFLGYELSLQAIRGDHAVTSP 


3044 


A 


41 


1316 


PPLGAGAGIHARSPHPARRLRLTAAGVGGRASG 
LLPTPWRRHHGPSGAAPYPAARLWQGPWRCRR 
PQPMAQRYDELPHYPGIADGPAALAGFPEAVPA 
APGPYGPHRPPQPLPPGLDSDGLKRDKDEIYGHP 
LFPLLALGFEKCELATCSPRDGAGAGLGTPRGGD 
VCSSDSFNEDNTAFAKQVCSERPFSSNPELDNLM 
IQAIQVLRFHLLELEKGKMPIDLV1EDRDGGCRE
DFEDYPAPCPSLPDQNNIWIRDHEDSGSVHLGTP 
GPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGE 

dedldqeprrnkkrgifpkva™mrawlfqhl 
shpypseeqkkqlaqdtglmqwnwfinarrr 
ivqpmidqsnrtgqgaafspegqpiggyteteph 
vafrapasvgmslnsegewhyl 


3045 


A 


3 


967 


VAHTQWHTCQRLSQLTHRSILKYLLIDTHACQV 

LILKHTHASLSLPSCQECFPSSIPSASHMVSHPHPP 

PSPRWGQTPEGLPAASPCGPGPRSCFSSILPTGDS 

WGMLACLCTVLWHLPAVPALNRTGDPGPGPSIQ 

KTYDLTRYLEHQLRSLAGTYLNYLGPPFNEPDFN 

PPRLGAETLPRATVDLEVWRSLNDKLRLTQNYE 

AYSHLLCYLRGLNRQAATAELRRSLAHFCTSLQ 

GLLGSIAGVMAALGYPLPQPLPGTEPTWTPGPAH 

SDFLQKMDDFWLLKELQTWLWRSAKDFNRLKK 

KMQPPAAAVTLHLGAHGF 


3046 


A


1185 


1584 


MYAYMYICTfflCICAYRGIHIDVYLYMCIYIHIWI 
HTYLCVHIYVYVYICTHICMCIHTYVYVYTYMY 
VYTYICLCVYlCLCVHIYLCVYIHNrYMGTHICMC 
IHTYVHMCICVYIHMYTCVYVYTYTCVYMY 


3047 


A 


811 


132 


SLDLLGPIGILQEGRDPGTQGPQEICEKQMPASPM 

NTDAHLD1NFKEGLKKERSYTGQFEANVRDEER 

QCGCGVVPDSLLMKVLSQRLDQQDCIQKGWVL 

HGVPRDLDQAHLLNRLGYNPNREFFLNVPFDSI 

MERLTLRRIDPVTGERYHLMYKPPPTMEIQARLL 

QOTKDAEEQVKLKMDLFYRNSADLEQLYGSAIT 

t xi r: r^n rvp vt\ rpu vrc c n. ttktdt dfvtd 
JLrsulA^i->r 1 1 VrliyjjioOUlNrLrJSJsJ.r 


3048 


A


2 


1166 


RPRRGQGLVQEVQTENVTVAEGGVAEJTCRLHQ 
YDGSIWIQNPARQTLFFNGTRALKDERFQLEEFS 
PRRVRIRLSDARLEDEGGYFCQLYTEDTHHQIAT 
LTVLVAPENPVVEVREQAVEGGEVELSCLVPRSR 
PAATLRWYRDRKELKGVSSSQENGKVWSVAST

VRFRVDRKDDGGIIICEAQNQALPSGHSKQTQYV 

LDVQYSPTARIHASQAWREGDTLVLTCAVTGN 

Pl^NQIRWNRGNESLPERAEAVGETLTLPGLVSA 

DNGTYTCEASNKHGHARALYVLVVYGESRLRPT 

EGGGGAPDPGAVVEAQTSVPYAIVGGILALLVFL 

IICVLVGMVWCSVRQKGSYLTHEASGLDEQGEA 

REAFLNGSDGHKRKEEFFI 


3049 


A 


3159 


882 


VGCTLRVGVMAAAGSRKRRLAELTVDEFLASGF 

DSESESESENSPQAETREAREAARSPDKPGGSPSA 

SRRKGRASEHKDQLSRLKDRDPEFYKFLQENDQ 

SLLNFSDSDSSEEEEGPFHSLPDVLEEASEEEDGA 

EEGEDGDRVPRGLKGKKNSVPVTVAMVERWKQ 

AAKQRLTPKLFHEVVQAFRAAVATTRGDQESAE 

ANICFQVTDSAAFNALVTFCIRDLIGCLQKLLFGK 

VAKDSSRMLQPSSSPLWGKLRVDIKAYLGSAIQL 

VSCLSETTVLAAVLRHISVLVPCFLTFPKQCRML 

LKRMVVVWSTGEESLRVLAFLVLSRVCRHKKDT 

FLGPVLKQMYITYVRNCKFTSPGALPFISFMQWT 

LTELLALEPG VAYQrL^FLYmQLAIHLRNAMTTR , 

KKETYQSVYNWQYVHCLFLWCRVLSTAGPSEA 

LQPLVYPLAQVnGCIKLIPTARFYPLRMHCIRALT 

LLSGSSGAFIPVLPFmEMFQQVDFmKPGRMSSK 

PINFSVILKLSNVNLQEKAYRDGLVEQLYDLTLE 

YLHSQAHCIGFPELVLPWLQLKSFIJIECKVANY 

CRQVQQLLGKVQENSAYICSRRQRVSFGVSEQQ 

AVEAWEKXTREEGTPLTLYYSHWRKLRDREIQL 

EISGKERLEDLNFPEIKRRXMADRKDEDRKQFKD 

LFDLNSSEEDDTEGFSERGILRPLSTRHGVEDDEE 

DEEEGEEDSSNSEDGDPDAEAGLAPGELQQLAQ 

GPEDELEDLQtSEDD


3050 


A 


870 


182 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTM 

GCCGCSRGCGSGCGGCGSSCGGCGSGCGGCGSG 

RGGCGSGCGGCSSSCGGCGSRCYVPVCCCKPVC 

SWVPACSCTSCGSCGGSKGGCGSCGGSKGGCGS 

CGCSQSSCCKPCCCSSGCGSSCSQSSCCKPCCCSS 

GCGSSCCQSSCCKPYCCQSSCCKPCSCFSGCGSS 

CCQSSCYKPCCCQSSCCVPVCCQCKI 


3051 


A 


175 


4330 


NIPRWNFQGKSFGWLVHFSSEEVDMASDSPARS 

LDEIDLSALRDPAGIFELVELVGNGTYGQVYKGR 

HVKTGQLAA1XVMDVTGDEEEE1XQE1NMLKKY 

SHHRNIATYYGAFIKK^PGMDDQLWLVMEFCG 

AGSVTDL1XNTKGYTLKEEWIAYICREILRGLSHL 

HQHKVIHRDIKGQNVLLTENAEVKLXODFGVSAQ 

LDRTVGRKNTFIGTPYWMAPEYIACDENPDATY 

DFKSDLWSLGITAIEMAEGAPPLCDMHPMRALF 

LIPRIsfPAPRLKSKKWSKKFQSFmSCLVKKHSQRP 

ATEQLMKHPFIRDQPNERQVRJQLIODHIDRTKKK 

RGEKDETEYEYSGSEEEEEENDSGEPSSILNLPGE 

STLRRDFLRLQLANKERSEALRRQQLEQQQREN 

LRKQQEREQRRHYEEQMRREEERRRAEHEQEYI 
RRQLEEEQRQLEILQQQLLHEQALLLEYKRKQLE 
EQRQAERLQRQLKQERDYLVSLQHQRQEQRPVE 
KKPLYHYKEGMSPSEKPAWAKEVEERSRLNRQS 
SPAMPHKVANRISDPNLPPRSESFSISGVQPARTP

PMLRPVDPQEPHLVAVKSQGPALTASQSVHEQPT 

KGLSGFQEALNVTSHRVEMPRQNSDPTSENPPLP 

TRJEKFDRSSWLRQEEDIPPKVPQRTTSISPALAR 

KNSPGNGSALGPRLGSQPIRASNPDLRRTEPILES 

PLQRTSSGSSSSSSTPSSQPSSQGGSQPGSQAGSSE 

RTRVRANSKSEGSPVLPHEPAKVKPEESRDITRPS 

RPASYKKAIDEDLTALAKELRELRIEETNRPMKK 

VTDYSSSSEESESSEEEEEDGESETHDGTVAVSDI 

PRLIPTGAPGSNEQYNVGMVGTHGLETSHADSFS 

GSISREGTLMIRETSGEKKRSGHSDSNGFAGHINL 

PDLVQQSHSPAGTPTEGLGRVSTHSQEMDSGTE 

YGMGSSTKASFTPFVDPRVYQTSPTDEDEEDEES 

SAAALFTSELLRQEQAKLNEARKISVVNVNP1NI 

RPHSDTPEIRKYKKRFNSEILCAALWGVNLLVGT 

ENGLMLLDRSGQGKVYNLINRRRFQQMDVLEG 

LNVLVTISGKKNKLRVYYLSWLRNRILHNDPEV 

EKKQGWITVGDLEGCIrTYXVVKYERIKFLVIALK 

NA VEIYA WAPKPYHKFMAFKSFADLQHKPLLVD 

Li Vbbut^KbKVlrObHIUrHVlDV^ 

SHIQGNITPHAIVILPKTDGMEMLVCYEDEGVYV 

NTYGRITKDVVLQWGEMPTSVAYIHSNQIMGW

GEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERN 

DKVFFASVRSGGSSQVFFMTLNRNSMMNW 


3052 


A 


1 


615 


MGQVECGGQKLGNQLEDDSEPAEGKVYSSDEE

KLEASAGDPAGSEQEEEGSGGDSEDDGFLDSSA 

ourOALLOrivrJvLKObLO J OAbbuArV 1 AOV 1 A 

PGGKSRRRRTAFTSEQLLELEKEFHCKKYLSLTE 

RSQIAHALKLSEVQVKIWFQNRRAKWKRIKAGN 

VSSRSGEPVItKPKIWPIPVHVNRFAVRSQHQQM 

EQGARP 


3053 


A


203


2167 


FGVRVPSNTQCLVPSFHCMQTSEWDSECLTSLQP 

LPLPTPPAANEAHLQTAAISLWTVVAAVQAIERK 

VEIHSRRLLHLEGRTGTAEKKLASCEKTVTELGN 

QLEGKGAVLGTLLQEYGLLQRRLENLENLLRKR 

OTWILRLPPGIKGDIPKVPVAFDDVSIYFSTPEWE 

KLEEWQKELYKNIMKGNYESLISMDYAINQPDV 

LSQIQPEGEHNTEDQAGPEESEIPTDPSEEPGISTS 

DILSWIKQEEEPQVGAPPESKESDVYKSTYADEE 

LVIKAEGLARSSLCPEVPVPFSSPPAAAKDAFSDV 

AFKSQQSTSMTPFGRPATDLPEASEGQVTFTQLG 

SYPLPPPVGEQVFSCHHCGKNLSQDMLLTHQCS 

HATEHPLPCAQCPKHFTPQADLSSTSQDHASETP 

PTCPHCARTFTHPSRLTYHLRVHNSTERPFPCPDC 

PKRFADQARLTSHRRAHASERPFRCAQCGRSFSL 

KISLLLHQRGHAQERPFSCPQCGIDFNGHSALIRH 

/"W/fTl-J'TY'i'E'D T>V'P<'~ , TTV , " > CI/ QT?AA~D V"CUT T XTLTD"DT "LIT 

GERPFSCPHCGKSFIRKHHLN1KHQRIHTGERPYP 
CSYCGRSFRYKQTLKDHLRSGHNGGCGGDSDPS 
GQPPNPPGPLITGLETSGLGVNTEGLETNQWYGE 
GSGGGVL 


3054 


A 


3 


2212 


SCGHKSAYGSYTGLQLFWEDGQELLQHQQLQD 
LRLCVHLRPQSEKVELSLWTLFWGKGEPSAVR 
EKLGKAGFAAASGPGGRPGAERASTVLNILHLT 
AESRWEPNACNRVSSSPAGVGPLDLPVGPLLYFF 
APWARASFLCHAFQIU>LTGIGLNTVRFTSEFPLH 

SKDPTAHKLLFTGNYLCKLHPRPRHAPQGSLSDF 

CHGTEGKDLPSEHNVSVEGVAQDRSPEATLCPQ 

KTCPCDICGLRLKDILHLAEHQTTHPRQKPFVCE 

AYVKGSEFSANLPRKQVQQNVHNPIRTEEGQAS 

PVKTCRDHTSDQLSTCREGGKDFVATAGFLQCE 

VTPSDGEPHEATEGWDFHIALRHNKCCESGDAF 

NNKSTLVQHQRIHSRERPYECSKCGIFFTYAADL 

TQHQKVHNRGKPYECCECGKFFSQHSSLVKHRR 

VHTGESPHVCGDCGKFFSRS SNLIQHKRVHTGEK 

PYECSDCGKFFSQRSNLIHHKRVHTGRSAHECSE 

CGKSFNCNSSLIKHWRVHTGERPYKCNECGICFFS 

HIASLIQHQIVHTGERPHGCGECGKAFIRSSDLMK 

liVjKVJdl lOJc-Kr YbCNbCOKLr D(jbbSLNSHRRLHT 

GERPYQCSECGBGFFNQSSSLNNHRRLHTGERPYE 

CSECGKTFRQRSNLRQHLKVHKPDRPYECSECG 

KAFNQRPTLIRHQKIHIRERSMENVLLPCSQHTPE 

ISSENRPYQGAVNYKLKLVHPSTHPGEVP 


3055 


A 


268 


2954 



ARRSSSSQGSAAPTPCQVVEASRDQLVAGPSGK 

MGNREMEELIPLVNRLQDAFSALGQSCLLELPQI 

AVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRP 

LVLQLVTSKAEYAEFLHCKGKKFTDFDEVRLEIE 

AETDRVTGMNKGISSIPINLRVYSPHVLNLTLIDL 

PGITKVPVGDQPPDIEYQIRMIMQRTRENCLTLA 

VTPANTDLANSDALKLAKEVDPQGLRTIGVITKL 

DLMDEGTDARDVLENKLLPLRRG YVGVVNRSQ 

KDIDGKKDIKAAMLAERJKJFLSHPAYRH1ADRM

GTPHLQKVLNQQLTNHIRDTLPNFRNKLQGQLLS 

IEHEVEAYKNFKPEDPTRKTKALLQMVQQFAVD 

FEKRIEGSGDQVDTLELSGGAKINRlFHERFPFErV 

KMEFNEICELRREISYAIKNIHGIRTGLFTTDMAFE 

AIVKKQIVKLKGPSLKSVDLVIQELINTVKKCTK 

KLANFPRLCEETERWANHIREREGKTKDQVLLLI 

DIQVSYINTNHEDFIGFANAQQRSSQVHKKTTVG 

NQVIRKGWLTISNIGIMKGGSKGYWFVLTAESLS 

WYKDDEEKEKKYMLPLDNLKVRDVEKSFMSSK 

mFALFNTEQRNVYKDYRFLELACDSQEDVDSW 

KASLLRAGVYPDKSVGNNKAENDENGQAENFS 

MDPQLERQVETIRNLVDSYMSIINKCIRDLIPKTI 

Mffl.MINNVKDFINSELLAQLYSSEDQNTLMEES 

Afc<^ A^JtUUJbMJLKM Y Q ALKEALG 1IGDIGTATVS 

TPAPPPVDDSWIQHSRRSPPPSPTTQRRPTLSAPL 

Axvr l auRur ArAlr ur nouAr r V rrKrur Lrrr r 

SSSDSFGAPPQVPSRPTRAPPSVPSRRPPPSPTRPTI 


3056 


A 


1674 


1839 


VVRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 


3057 


A


1674 


1839 


WRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WMDVACrTTTMYFMCEFDKKNM 


3058 


A 


3363 


2525 


FLVKLILIILCRCLHSLSRSVQQLRTSFQDHAVWK 

PT M1CVT ONAPTYFTT WA^Ml PTJT T T PFQPQT^PPT 

LESGAVELLCGLTQSENPALRVNGIWALMNMAF 
QAEQKIKADELRSLSTEQLFRLLSDSDLNVLMKT 
LGLLRNLLStRPHEDKIMSTHGKQIMQAVTLILEG 
EHNIEVKEQTLCILAN1ADGTTAKDLIMTNDDILQ 
KIKYYMGHSHVKLQLAAMFCISNLIWNEEEGSQ 
ERQDKLRDMGIVDILHKLSQSPDSNLCDKAKMA 
LQQYLA 


3059 


A 




I O / 


ooW r oi^oo^ivlhir Jr £>r JtiLJri V AArl Y LrKDbr VJRJLLLE 

FKAEVDPLSDKGTTPLQLAIIRERSSCVKILLDHN 

ANIDIQNGFLLRYAVIKSNHSYCRMFLQRGADTN 
T GRT FDGOTPT WT QAT ttfYnVT PAPA/TT vxrw: An 

TNTRNYEGQTPLAVSISISGSSRPCLDFLQEVTSM 


3060 


A 


30 


234 


PPLQLDMDPNCYCADGDSCTCAGSCKCKECKCT 
o^JSJS.o\^j^ov^v^Jr AUi^AJ^^A^O^Il^lMjrA 1 JJJvL/oCC 
A 


3061 


A 


428 


720 


VRRDVRQQATWAMASDLDFSPPEVPEPTFLENL 

LRYGLFLGAIFQLICVLAIIVPIPKSHEAEAEPSEPR 

SAEVTRKPKAAWSVNKRPKKETKKKR 


3062 


A 


1589 


276 


WKQKYEPLGLDAAGIEEAITAVGSFILKANELLQ 

VIDSSMKOTKAPTRWLYVAMLRMTEDHVLPELN 

KMTQKDITFVAEFLTEHFNEAPDLYNRKGKYFN 

VERVGQYLKDEDDDLVSPPNTEGNQWYDFLQN 

SSHLKESPLLFPYYPRKSLHFVKRRMENIIDQCLQ 

KPADVIGKSMNQAICIPLYRDTRSEDSTRRLFKFP 

FLWNNKTSNLHYLLFTILEDSLYKMCILRRHTDIS 

QSVSN'GLIAKFGSFTYATTEKVRRSrYSCLDAQF 

YUUJb I V 1 V VLKDI VCjKJbORJjRJ^LVQLPLSLVYN 

SEDSAEYQFTGTYSTRLDEQCSAIPTRTMHFEKH 

WRLLESMKAQYVAGNGFRKVSCVLSSNLRHVR 

VFEMDIDDEWELDESSDEEEEASNKPVKIKEEVL

SESEAENQQAGAAALAPEIVIKVEKLDPELDS 


3063 


A 


50 


849 


DKMPSIFAYQSSEVDWCESNFQYSELVAEFYNTF 

SNIPFFIFGPLMMLLMHPYAQKRSRYTYVVWVLF 

MnGLFSMYFHMTLSFLGQLLDEIAELWLLGSGYS 

i WMjrKi^irrbrLUuNlO 1 VVb 1LLSFL 

RPTVNAYALNSIALHILYIVCQEYRKTSNKELRH 

LIEVSVVLWAVALTSWISDRLLCSFWQRIHFFYL 

HSIWHVLISITFPYGMVTMALVDANYEMPGETL 

KVRYWPRDSWPVGLPYVEIRGDDKDC 


3064 


A 


1523 


925 


AATMADGQMPFSCHYPSRLRRDPFRDSPLSSRLL 

uu\jr\j]yLUrrrUL)Li 1 Ao WrJJ W A_LrKLooA W rOTL 

RSGMVPRGPTATARFGVPAEGRTPPPFPGEPWK 

VCVTWHSFKPEELMVKTKDGYVEVSGKHEEKQ 

QEGGIVSKNFTKKIQLPAEVDPVTVFASLSPEGLL 

IIEAPQVPPYSTFGESSFNNELPQDSQEVTCT 


3065 


A 


230 


2929 


LSTSLTGSHLFSLGNHSTRENLNAGNFNFPSEGH 

LVRSTGPGGSFAKHMVAQCVSPKGPLACSRTYF 

FGATHVPYLGGDSICLPKKTEQIRLLSQIYAAVIE 

AVLAGIACYAKTSSLTKAKEVAEQTLGSGLDSFE 

LIPFKAALRSKMTFHIHAVNNQGRIVPLDSEDSLS 

FVKTACMAVYDIPDLLGGNGCLGSVVFSESFLTS 

QILVKEKDGTVTTETSSVVLTAAVPRFCSWLVED 

NEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLY 

SSNLQSWPEEGNVHFFSSGLLFSHCRHGSfflSKD 

HMNSISFYDGDSTSTVAALLTDFK^T T PHT PVKF 

HGSSNFLMIALFPKSKJYQAFYSEVFSLWKQQDN 
SGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPA 
GEKRSSLKLLSAKLPELDWFLQHFAISSISQEPVM 
RTHLPVLLQQAEmTTHRIESDKVIISrVTGLPGCH 
ASELCAFLVTLHKECGRWMVYRQIMDSSECFHA 

AHFQRYLSSALEAQQNRSARQSAYIRKKTRLLV 

VLQGYTDVIDWQALQTHPDSNVKASFTIGAITA 

CVEPMSCYMEHRPLFPKCLDQCSQGLVSNWFT 

SHTTEQRHPLLVQLQSLIRAANPAAAFILAENGrV 

TRNEDIELILSENSFSSPEMLRSRYLMYPGWYEG 

KLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKA 

IQSSIKPSPFSGN1YH1LGKVKFSDSERTMEVCYNT 

LAN bLblMP V LEur 1 PPPDoKb VSQDSSGQQECYL 

VFIGCSLKEDSIKDWLRQSAKQKPQRKALKTRG 

MLTQQEIRSIHVKRHLEPLPAGYFYNGTQFVNFF 

GDKTDFHPLMDQFMNDYVEEANREBEKYNQELE 

QQEYHDLFELKP 


3066


A 




f DO 


T A T5T T"> / m *r\T%/~ i T r n T , / - M~>T> OT TTi A A XTTVTfcO A A > /n a A /™i A T"» 

LArLKCQPO rRTQPRSHPAANDPSAAMSAAG AR 

GLRATYHRLLDKVELMLPEKLRPLYNHPAGPRT 

WFWAPIMKWGLVCAGLADMARPAEKLSTAQS 

AVLMATGFIWSRYSLVIIPKNWSLFAVNFFVGAA 

GASQLFRIWRYNQELKAKAHK 


3067 


A 


2 


1016 


EFARRRVF1AAREMSLLRSLRVFLVARTGSYPAG 

SLLRQSPQPRHTFYAGPRLSASASSKELLMKLRR 

KTGYSFVNCKKALETCGGDLKQAEIWLHKEAQ

ICEGWSKAAKiQGRKTKEGLIGLLQEGNTTVLVE 

VNCETDFVSRNLKFQLLVQQVALGTMMHCQTL 

KDQPSAYSKGFLNSSELSGLPAGPDREGSLKDQL 

ALAIGKLGENMILKRAAWVKWSGFYVGS 

AMQSPSLHKLVLGKYGALVICETSEQKTNLEDV 

GRRLGQHWGMAPLSVGSLDDEPGGEAETKML 

SQPYLLDPSITLGQYVQPQGVSWDFVRFECGEG 

EEAAETE 


3068 


A 


3 


1679 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGL 

VPLTDDTSHAGPPGPGRALLECDHLRSG VPGGR 

RRKDWSCSLLVASLAGAFGSSFLYGYNLSVVNA 

PTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 

SIFAIGGLVGTLIVKMIGKVLGRKHTLLANNGFAI 

SAALLMACSLQAGAFEMLIVGRFIMGIDGGVALS 

VLPMYLSEISPKEIRGSLGQVTAIFICIGVFTGQLL 

GLPELLGl^STWPYLFGVIVWAVVQLLSLPFLP 

DSPRYLLLEKHNEARAVI^FQTFLGKaDVSQEV 

EEVLAESRVQRSIRLVSVLELLRAPYVRWQVVT 

VIVTMACYQLCGLNAI\WYTOSn?GKAGIPPAKIP 

YVTLSTGGIETLAAVFSGLV1EHLGRRPLLIGGFG 

JLMuLr r u I L i I 1 L 1 LQDHAr W VrYLSIVGJLAIIAS 

FCSGPGGIPFILTGEFFQQSQl^AAFnAGTVNWLS 

NFAVGLLFPFIQKSLDTYCFLVFATICITGAIYLYF 

VLPETKNRTYAEISQAFSKRNKAYPPEEKIDSAV 

TDGKINGRP 


3069 


A 


861 


300 


AAGAVVSAMPKAKGKTRRQKFGYSVNRKRLNR 

XT A DDI/ A A T5D TCPOTTTT* TT A TI 7T*\TT A TV 7T> /~\"VTT A T^H K /~~* 

LAVDPNRAVPLRXRKVKAMEVDIEERPKELVRK 
PYVLNDLEAEASLPEKKGNTLSRDLIDYVRYMV 
ENHGEDYKAMARDFKNYYODTPK OfR SKTMVY 
KRFYPAEWQDFLDSLQKRICMEVE 


3070 


A 


325 


2019 


LAEPEVATDSGQQADLPAEGGDPRAEASCSVLH 
SKPHAMADSRDPASDQMQHWKEQRAAQKADV 
LTTGAGNPVGDKLNVITVGPRGPLLVQDVVFTD 
EMAHFDRERIPER WHAKGAG AFG YFE VTHDIT 

KYSKAKVFEHIGKKTPIA VRFSTVAGESGSADTV 

RDPRGFAVKFYTEDGNWDLVGNNTPIFFIRDPILF 

PSFfflSQKRNPQTHLKDPDMVWDFWSLRPESLH 

QVSFLFSDRGIPDGHRHMNGYGSHTFKLVNANG 

EAVYCKFHYKTIXJGIKNLSVEDAARLSQEDPDY 

GIRDLFNAIATGKYPSWTFYIQVMTFNQAETFPF 

NProLTKVWPHKDYPLIPVGKLVLNRNPVNYFA 

EVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDT 

HRHRLGPNYLHIPVNCPYRARVANYQRDGPMC 

MyUNyvjuAPNYYPNSFGAPEQQPSALEHSIQYS 

GEVRRFNTANDD>TVTQVRAFYVNVLNEEQRJCR 

LCEN1AGHLKDAQIFIQKKAVKNFTEVHPDYGSH 

IQALLDKYNAEBCPKNAIHTFVQSGSHLAAREKA 

NL 


3071 


A 


1 



1187 


SLGWLERPPALSRAAGDGARRLSGSRRGDVWLT 

SSAAGLLRSVAGGSWCGGQLRARGGSGRCVAR 

AMTGNAGEWCLMESDPGVFTELIKGFGCRGAQ 

VEEIWSLEPENFEKLKPVHGLIFLFKWQPGEEPA 

GSVVQDSRLDTIFFAKQVINNACATQAIVSVLLN 

CTHQDVHLGETLSEFKEFSQSFDAAMKGLALSN 

SDVIRQVHNSFARQQMFEFDTKTSAKEEDAFHF 

VSYVPVNGRLYELDGLREGPIDLGACNQDDWIS 

AVRPVIEKRJQKYSEGEIRFNLMAlVSDRKMr\ r EQ 

KIAELQRQLAEEEPMDTDQGMSMLSAIQSEVAK

NQMLmEEVQKLKRYKIENIRRKHNYLPFIMELL 

KTLAEHQQLIPLVEKAKEKQNAKKAQETK 


3072 


A 


103


2775 


RLRTLAPPGLLLGPPLVPDSRRRHQASLTPLHISG 

SPQLVGRGDRKLRTEVLVPPAALPAETRQRRSER 

LPRRTCPRGGAPGPGRSRLPRSLPPPSAIPGLRSPV 

WAAGLGGGGRREPSRGKGGAALRARHRSTMAE 

LGAGGDGHRGGDGAVRSETAPDSYKVQDICKNA 

SSRPASAISGQNNNHSGNKPDPPPVLRVDDRQRL 

ARERREEREKQLAAREIVWLEREERARQHYEKH 

LEERJKKRLEEQRQKEERRRAAVEEKRRQRLEED 

KERHEAVVRRTMERSQKPKQKHNRWSWGGSLH 

GSPSIHSADPDRRSVSTMNLSKYVDPV1SKRLSSS 

SATLLNSPDRARRLQLSPWESSWNRLLTPTHSF 

LARSKSTAALSGEAVIPICPRSASCSPnMPYKAAH 

SRNSMDRPKLFVTPPEGSSRRRIIHGTASYKKERE 

REKVLFLTSGTRRAVSPSNPKARQPARSRLWLPS 

KSLPHLPGTPRPTSSLPPGSVKAAPAQVRPPSPGN 

IRPVKREVKVEPEKKDPEKEPQKVANEPSLKGRA 

PLVKVEEATVEERTPAEPEVGPAAPAMAPAPAS 

APAPASAPAPAPVPTPAMVSAPSSTVNASASVKT 

SAGTTDPEEATRLLAEKRRLAREQREKEERERRE 

QEELERQKREELAQRVAEERTTRREEESRRLEAE 

QAREKEEQLQRQAEERALREWEEAERAQRQKEE 

EARVREEAERVRQEREKHFQREEQERLERKKRL 

bbi^4KRTKRTEATDKKTSDQRNGDIAKGALTGG 

TEVSALPCTTNAPGNGKPVGSPHVVTWO^KVT 

VESTPDLEKQPNENGVSVQNENFEEUNLPIGSKP 
SRLDVTNSESPEIPLNPILAFDDEGTLGPLPQVDG 
VQTQQTAEVI 


3073 


A 


67 


2415 


PPRVCRDHVCLICWDPIAGTGGSRSTMPALPLDQ 
LQITHKDPKTGKLRTSPALHPEQKADRYFVLYKP 

PPKDNIPALVEEYLERATFVANDLDWLLALPHD 

KFWCQVIFDETLQKCLDSYLRYVPRKFDEGVAS 

APEVVDMQKRLHRSVFLTFLRMSTHKESKDHFIS

PSAFGEILYNNFLFDIPKILDLCVLFGKGNSPLLQ 

KMIGNIFTQQPSYYSDLDETLPTILQVFSNILQHC 

GLQGDGANTTPQKLEERGRLTPSDMPLLELKDIV 

LYLCDTCTTLWAFLDIFPLACQ'EFQKHDFCYRLA 

SFYEAAIPEMESAIKXRRLEDSKLLGDLWQRLSH 

SRKKLMEEFHIILNQICLLPE.ESSCDNIQGFIEEFL 

QIFSSLLQEKRFLRDYDALFPVAEDISLLQQASSV 

LDETRTAYELQAVESAWEGVDRRKATDAKDPSV 

IEEPNGEPNGVTVTAEAVSQASSHPENSEEEECM 

GAAAAVGPAMCGVELDSLISQVKDLLPDLGEGFI 

LACLEYYHYDPEQVMKILEERLAPTLSQLDRNL 

DREMKPDPTPLLTSRHNVFQNDEFDVFSRDSVDL 

SRVHKGKSTRKEENTRSLLNDKRAVAAQRQRYE 

QYSWVEEVPLQPGESLPYHSVYYEDEYDDTYD 

nNOVfiANDAn^nnPT t^rrpfttpovt ptwprc 

GQEEDDDDEEDDADEEAPKPDHFVQDPAVLREK 
AEARRMAFLAICKGYRHDSSTAVAGSPRGHGQS 
RETTOERRKKF ANK" ATR AN1TMRR TM A DR KR^fcT 
GMIPS 


3074 


A 


3 


251 


GEARSPPPAA ALLDMDPETCPCPSGG SCTCADSC 
KCEGCKCTSCKKSCCSCCPAECEKCAKDCVCKG 
GEAAEAEAEKCSCCQ 


3075 


A 


255


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 
RKFRELl^MRNEARKLNH^ 
M^AKKARLEWELKEEEKKKECAARGEDYEKVK 
LLEISAFDAFRWFRTCKTfTRTrhJPnT rtP^nVA A AHT 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 
LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 
PYNDDADIDYINERNAKFNKKAERFYGK 
KQNLERGTAV 


3076 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 
RKFRELI^MRNfEARKLNHQEVVEEDKRLKLPAN 
WEAKKARLEWELKEEEKKKECAARGEDYEKVK
LLFTSAFDAFRWFRTOn<TRTCNn>r>T fiPCjnVA a aat 

RQYHRLTKQ1XPDMETYERLREKHGEEFFPTSNS 
LLHGTlWPSTEEIDRMVIDLEKQiEKRDKYSRRR 
PYNDDADroYINERNAKJmKAERFYGKYTAEI 
KQNLERGTAV 


3077 


A 


1 


968 


FRLRPRRACAQLLWHPAAGMASWAKGRSYLAP 

GLLQGQVAIYTGGATGIGKAIVKELLELGSNWI 

ASRKLERLKSAADELQANLPPTKQARVIPIQCNIR 

NEEEVNNLVKSTLDTFGKINFLVNNGGG 

EfflSSKGWHAVLETl^TGTFYMCKAVYSSWMK 

KHGGSTVNITTVPTK" Af^FPT AVPT^Ci A AR A nWNTT T 

KSLAFEWACSGIRINCVAPGVIYSQTAVENYGSW 
GQSFFEGSFQKIPAKRIGVPEEVSSWCFLLSPAA 
SFITGQSVDVDGGRSLYTHSYEVPDHDNWPKGA
GDLSVVKKMKETFKEKAKL 


3078 


A 


2 


3508 


FVRESGKAPVTFDDITVYLLQEEWVLLSQQQKEL 

CGSNKLVAPLGPTVA2^PELFRKFGRGPEPWLGS 

VQGQRSLLEHHPGKKQMGYMGEMEVQGPTRES 



WO 01/57190 PCT/US01/04098



GQSLPPQKKAYLSHLSTGSGHffiGDWAGRNRKL 

LIGPRSIQKSWFVQFPWLIMNEEQTALFCSACREY 

PSIRDKRSRLIEGYTGPFKVETLKYHAKSKAHMF 

CWfALAARDPIWAARFRSIRDPPGDVLASPEPLF 

TADCPIFYPPGPLGGFDSMAELLPSSRAELEDPGG 

DGAIPAMYLDCISDLRQKEITDGIHSSSDINILYN 

DAVESCIQDPSAEGLSEEVPVVFEELPWFEDVA 

VYFTREEWGMLDKRQKELYRDVM^ 

LGPAAABCPDLISKLERRAAPWIKDPNGPKWGKG 

I^PGNKKMVAVREADTQASAADSALLPGSPVEA 

RASCCSSSICEEGDGPRRIKRTYRPRSIQRSWFGQ 

FPWLVIDPKETKLFCSACIERPNLHDKSSRLVRG 

YTGPFK VETLKYHE V SKAHRLC VNTVEIKEDTPH 

TALVPEISSDLMANMEHFFNAAYSIAYHSRPLND

FEKILQLLQSTGTVILGKYRNRTACTQFIKYISETL 

KREILEDVRNSPCVSVLLDSSTDASEQACVGIYIR 

YFKQMEVKESYITLAPLYSETADGYFETWSALD 

ELDIPFRKPGWVVGLGTDGSAMLSCRGGLVEKF 

QEV1PQLLPVHC V AHRLHL AV VD ACG SIDLVKK 

CDRHIRTVFKTYQSSNIOULNELQEGAAPLEQEIIR 

LKDLNAVRWVASRRRTLHALLVSWPALARHLQ 

RVAEAGGQIGHRAKGMLKLMRGFHFVKFCHFL 

LDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVAL 

ESLRHQAGPKEEEFNASFKDGRLHGICLDKLEVA 

EQRFQADRERTVLTGIEYLQQRFDADRPPQLKN 

MEVFDTMAWPSGIELASFGNDDILNLARYFECSL 

PTGYSEEALLEEWLGLKT1AQHLPFSMLCKNALA 

ynUKrrJLJLoJs^LMA V V Vt vrxo 1 oCCbRCjFKAMN 

RIRTDERTKLSNEVLNMLMMTAVNGVAVTEYD 

PQPAIQHWYLTSSGRRFSHVYTCAQVPARSPASA 

RLRKEEMGALYVEEPRTQKPPILPSREAAEVLKD 

CIMEPPERLLYPHTSQEAPGMS 


3079 


A


343 


1513 


FSPLEPRLCSLGGWGALQAGEPCQPSRAGCGRE 

GATMGCTLSAEERAALERSKAIEKNLKEDGISAA 

KDVKLLLLGAGESGKSTIVKQMKIIHEDGFSGED 

VKQYKPVVYSNTIQSLAAIVRAMDTLGIEYGDK 

ERKADAKMVCDVVSRMEDTEPFSAELLSAMMR 

LWGDSGIQECFNRSREYQLNDSAKYYLDSLDRIG 

AADYQPTEQDILRTRVKTTGIVETHFTFKNLHFR 

T Trri\?rzrZfXQ CPU V VYI 7TUf~*1?T?Y\\lT A TTCOI TAT C/^vm " 

JLrJL/ Voul^JtvoiiKlsJ^WixlUrJbJJ V i AllrCVAL/oOYD 

QVLIffiDETTNRMrlESLKLFDSICKbJKWFTDTSII 

LFLNKKDIFEEKIKKSPLTICFPEYTGPSAFTEAVA 

YIQAQYESKNKSAHKEIYSHVTCATDTNNIQFVF 

DAVTDVnAKNLRGCGLY 


3080 


A 


41


997 


EARTARELTDGVTDGLTMADQPKPISPLKNLLA 

GGFGGVCLVFVGHPLDTVKVRLQTQPPSLPGQPP 

MYSGTFDCFRKTLFREGITGLYRGMAAPIIGVTP 

MFAVCFFGFGLGKKLQQKHPEDVLSYPQLFAAG 

MLSGVFTTGIMTPGERIKCLLQIQASSGESKYTGT 

T T\r^ A VVT VACr/^TT>PTVlT , T\7T *TT X m T""\t JT* A O/^Tv A 

LUUAKKL Y \jpr OIKCjl Y KO 1 VL 1 JLMRD VPASGM 

YFMTYEWLK2OTTPEGKRVSELSAPRTLVAGGIA 

GIFNWAVAIPPDVLKSRFQTAPPGK^NGFRDVL 

RELIRDEGVTSLYKGFNAVMIRAFPANAACFLGF 

EVAMKFLNWATPNL 


3081 


A 


3 


1996 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQE 
NAASGSNASGSESDQDERGDSGQPSNKELFGDD 

SEDEGASHHSGSDNHSERSDNRSEASERSDHEDN 

DPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSEA 

EGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDE 

ERAQGSDEDKLQNSDDDEKMQNTDDEERPQLS 

DDERQQLSEEEKANSDDERPVASDNDDEKQNSD 

DEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSESARGSDSEDEVLRMKRKNAIASDSE 

ADSDTEVPKDNSGTMDLFGGADDISSGSDGEDK 

PPTPGQPVDENGLPQDQQEEEPIPETRIEVEIPKV 

NTDLGNDLYFVKLPNFLSVEPRPFDPQYYEDEFE 

DEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEI 

KESNART/KWSDGSMSLHLGNEVFDVYKAPLQG 

DHNHLFIRQGTGLQGQAVFKTKLTFRPHSTDSAT 

T-TRVTV/TTT QT A nUPCVTHI/ TDTT DXrf A nTJEPAD tt 1 
txrUSJvl i JLoJLAJL-'KL-oJv 1 yivliUL-r MAOKJJi'iiOyKl r. 

MnCICEEERLRASIRRESQQRRMREKQHQRGLSAS 
YLEPDRYDEEEEGEESISLAAIKNRYKGGIREERA 
RIYSSDSDEGSEEDKAQRLLKAKKLTSDEVRPNL 
FNSRGLSCTQEPTALNEELTDQAGTN 


3082 


A


3 


921 


VEFCLPASADSS^LVAASLAGVRKMATNFLAHE 
KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 
GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 
HGELWRIASLEVENQSLRGWQELQQAISKLEA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

VP A TP A P'nPkTT'R'rs'nT'rM 'cnQTwxvx:x:Tw"c a a/^t dtc 
]SJrJ\ lrJ\r,UUE,uVUiuLr 

RLRQYAEKKAKKPALVAKSSILLDVKPWDDETD 
MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 
QIQCVVEDDKVGTDLLEEEITKPEEHVQSVDIAA 
FNKI 


3083 


A 


3 


921


VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELWRIASLEVENQSLRGWQELQQAISKLEA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

RLRQYAEKRAKKPALVAKSSILLDVKPWDDETD 
MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 
QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA 
FNKI 


3084 


A 


128 


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS 

DLLDKEFLPILQEEPLPPLALVP^TEEEQROTSMS 

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRJPREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

VETPWGAPGMGSVSTEPPDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 






WO 01/57190 PCT/US01/04098 




EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNIIPSVTRSVSVPDTGSIWELQ

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRELEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRARNNTHSNLHTSIGNSVWGSINTGPPNQWA

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNK^ASLSKSVGVSNRQNKKVEEEEKLLK 

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

1 r VbrL&±, vizbr YbvHD YJKA YLODTSEAKEr AK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKXQKMVRADPSLLGFSVNASSER 

LNMGEIETLDDY 


3085


A 


128


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS 

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDKDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

VETPVVGAPGMGSVSTEPDDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVKRACDESFQPLGDIMKMWGRVPFSP

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNIIPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEA ART RWFRFFRKRICFT PVOT?nK"PT 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 
TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 
EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 
SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

^NRARIWTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNK^ASLSKSVGVSNRQNKKVEEEEKLLK 

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

lr Vori^JsJiVc.or i xivrUJ x IKA ibui; 1 obAJKJir AK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEIETLDDY 


3086 


A 


675 


1334 


LHPAATSTAWLHVPPGLSMALSWVLTVLSLLPL 

LEAQIPLCANLVPVPITOATLDRITGKWFYIASAF 

KJNJbbi JNKbvv^Jilv^Alrr Yr irrJK.lbI> I IrJLKbYQr 

RQDQCIYNTTYLNVQRENGTISRYVGGQEHFAH 

LLILRDTKTYMLAFDVNDEKNWGLSVYADKPET 

TKEQLGEFYEALDCLRIPKSDWYTDWKXDKCE 

PLEKQHEKERKQEEGES 


3087 


A 


1 


1575 


CTPVARSMATTATCTRFTDDYQLFEELGKGAFS 

WRRCVKKTSTQEYAAKIINTKKLSARDHQKLE 

REARICRLLKHPNIVRLHDSISEEGFHYLVFDLVT 

GGELFEDIVAREYYSEADASHCfflQILESVNHIHQ 

HDIVHRDLKPENLLLASKCKGAAVKLADFGLAIE 

VQGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVD 

IWACGVILYILLVGYPPFWDEDQHKLYQQIICAG 

AYDFPSPEWDTVTPEAKNLINQMLtrNPAKRlTA 

DQALKJHDPWVCQRSTVASMMHRQETVECLRKFN 

ARRKLKGAILTTMLVSRNFSAAKSLLNKKSDGG 

VKPQSNNKNSLVSPAQEPAPLQTAMEPQTTVVH 

NA i UKjUsXib l bbCN 111 bUEDLKVKJPwQEIlKJTEQ 

LIEAINNGDFEAYTKICDPGLTSFEPEALGNLVEG 

MDFHKFYFENLLSKNSKPIHTTILNPHVHVIGED 

AACIAYIRLTQYIDGQGRPRTSQSEETRVWHRRD 

GKWLNVHYHCSGAPAAPLQ 


3088 


A 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDFAE 

QLKWSAELARLGESIMDGKQGGMDGSKPAGPR 

DFPGIRLLSNPLMGDAVSDWSPMHEAAIHGHQL 

SLRNLISQGWAVNnTADHVSPLHEACLGGHLSC 

VKILLKHGAQVNGVTADWHTPLFNACVSGSWD 

CVNLLLQHGASVQPESDLASPIHEAARRGHVEC

V IN oJulA i OOJN JJJWKlbxiJLO 1 r L Y LACn-M^^KACV 

KKLLESGADVNQGKGQDSPLHAVARTASEELAC 

LLMDFGADTQAKNAEGKRPVELVPPESPLAQLF 

LEREGPPSLMQLCRLRIRKCFGIQQHHKITKLVLP 


3089 


A 


73 


432 


DMAGLMTIVTSLLFLGVCAHHIIPTGSVVLPSPCC 
MFFVSKRIPENRVVSYQLSSRSTCLKAGVIFTTKK 
GQQFCGDPKQEWVQRYMKNLDAKQKKASPRA 
RAVAVKGPVQRYPGNQTTC 


3090 


A 


4627 


611 


LMEAGGGGGALPAGVETMVLTLGESWPVLVGR 

RFLSLSAADGSDGSHDSWDVERVAEWPWLSGTI 

RAVSHTDVTKKDLKVCVEFDGESWRKRRWIEV 

YSLLRRAFLVEHNLVT AERK^PFT^FRTVOWPATT 

YKPLLDKAGLGSITSVRFLGDQQRVFLSKDLLKP 

IQDVNSLRLSLTDNQIVSKEFQALIVKHLDESHLL 

KGDKNLVGSEVKIYSLDPSTQWFSATVVNGNPA 

SKTLQVNCEEIPALKIVDPSLIHVEVVHDNLVTC 

GNSARIGAVKRKSSENNGTLVSKQAKSCSEASPS 
MCPVQSVPTTVFKEILLGCTAATPPSKDPRQQST 
PQAANSPPNLGAKIPQGCHKQSLPEEISSCLNTKS 
EALRTKPDVCKAGLLSKSSQIGTGDLKILTEPKGS 
CTQPKTNTDQENRLESVPQALTGLPKECLPTKAS 
SKAELEIANPPELQKHLEHAPSPSDVSNAPEVKA * 
GVNSDSPNNCSGKKVEPSALACRSQNLKESSVK 
VDNESCCSRSNNKIQNAPSRKSVLTDPAKLKKLQ 
QSGEAFVQDDSCVNIVAQLPKCRECRLDSLRKD 
KEQQKDSPVFCRPFHFRRLQFNKHGVLRVEGFLT 
PNKYDNEAIGLWLPLIXNVYGIDLDTAKYILANI 
GDHFCQMVISEKEAMSTTEPHRQVAWKRAVKG 
VREMCDVCDTTIFNLHWVCPRCGFGVCVDCYR 
MKRKNCQQGAAYKTFSWLKCVKSQIHEPENLM 
PTQIIPGKALYDVGDIVHSVRAKWGIKANCPCSN 
RQFKLFSKPASKEDLKQTSLAGEKPTLGAVLQQ 
NPSVLEPAAVGGEAASKPAGSMKPACPASTSPLN 
WLADLTSGNVNKENKEKQPTMPILKNEIKCLPPL 
PPLSKSSTVLHTFNSTILTPVSNNNSGFLRNLLNSS 
TGKTENGLKNTPOLDDIFASLVQNKTTSDLSKR 
PQGLTIKPSILGFDTPHYWLCDNRLLCLQDPNNK 
. SNWNVFRECWKQ<3QPVMVSGVHHKLNSELWK 
PESFRKEFGEQEVDLVNCRTNEIITGATVGDFWD 
GFEDVPNRLKNEKEPMVLKLKDWPPGEDFRDM 
MPSRFDDLMANIPLPEYTRRDGKLNLASRLPNYF 
VRPDLGPKMYNAYGLITPEDRKYGTTNLHLDVS
DAANVMVYVGIPKGQCEQEEEVLKTIQDGDSDE 
L> i jjsjkjt JUGOi^lvr O/vL W rll Y AAJsJJ 1 bisJKbr LKJv 
VSEEQGQENPADHDPIHDQSWYLDRSLRKRLHQ 
EYGVQGWAIVQFLGDVVFIPAGAPHQVHNLYSC 
IKVAEDFVSPEHVKHCFWLTQEFRYLSQTHTNHE 
DKLQVKNVIYHAVKDAVAMLKASESSFGKP 


3091


A 


97 


1838 


B^GARRGGWKRKMPSTDLLMLKAFEPYLEILEV 

YSTKAKNYVNGHCTKYEPWQLIAWSVVWTLLI 

VWGYEFVFQPESLWSRFKKKCFKLTRKMPIIGRK 

IQDKLNKTKDDISKIsfMSFLKVDKEYVKALPSQG 

LSSSAVLEKLKEYSSMDAFWQEGRASGTVYSGE 

EKLTELLVKAYGDFAWSNPLHPDIFPGLRKIEAEI 

VRIACSLFNGGPDSCGCVTSGGTESILMACKAYR 

DLAFEKGIKTPEIVAPQSAHAAFNKAASYFGMKJ

VRWLTXMMEXHDVRAMRRAISRNTAMLVCSTP 

QFPHGVDDPVPEVAKLAVKYKIPLHVDACLGGFL 

IVFMEKAGYPLEHPFDFRVKGVTSISADTHKYGY 

APKGSSLVLYSDKKYRNYQFFVDTDWQGGIYAS 

PTIAGSRPGGISAACWAALMHFGENGYVEATKQI 

IYRLSNLMTAKGWNLNQLQFPPSIHFCITLLHAR 
KRVAIQFLKDIRESVTQIMKNPKAKTTGMGAIYG 
MAQTTVDRNMGAELSSVFLDSLYSTDTVTQGSQ 
MNGSPKPH 


3092 


A 


79 


2652 


LCSQNSPEDWVNFSSEKQKRYPWYWTGRKLRSE 

RAMKIQKKLTGCSRLMLLCLSLELLLEAGAGNIH 

YSVPEETDKGSFVGNIAKDLGLQPQELADGGVRJ 

VSRGRN4PLFALNPRSGSLITARRIDREELCAQSM 

PCLVSFNILVEDKMKLFPVEVEnDINDNTPQFQL 

EELEFKMNEITTPGTRVSLPFGQDLDVGMNSLQS 

YQLSSNPHFSLDVQQGADGPQHPEMVLQSPLDR 

EEEAVHHLDLTASDGGEPVRSGTLRIYIQWDAN 

DNPPAFTQAQYHINVPENVPLGTQLLMVNA1DP 

DEGANGEVTYSFHNVDHRVAQIFRLDSYTGEISN 

KEPLDFEEYKMYSMEVQAQDGAGLMAKVKVLI

KVLDVhnDNAPEVTrtSVTTAVPENFPPGTIIALISV 

HDQDSGDNGYTTCFIPGNLPFKLEKLVDNYYRL 

VTERTLDRELISGYNITITAIDQGTPALSTETHISL 

LVTDINDNSPVFHQDSYSAYIPENNPRGASIFSVR 

AHDLDSNENAQI7YSLIEDTIQGAPLSAYLSINSD 

TGVLYALRSFDYEQFRDMQLKVMARDSGDPPLS 

SNVSLSLFLLDQKDNAPE3LYPALPTDGSTGVEL 

APRSAEPGYLVTKVVAVDRDSGQNAWLSYRLL 

KA'SEPGLFSVGLHTGEVRTARALLDRDALKQSL 

WAVQDHGQPPLSATVTLTVAVADRIPDDLADLG 

o Jjllr o AJSJ'iN JL/olJJL ILYLVV AllAA V.^C V Jr L Ar VI V 

LLAHRLRRWHKSRLLQASGGGLASTPGSHFVGV 
DGVRAFLQTYSHEVSLTADSRKSHLEFPQPNYAD 
TLISQESCEKKGFLSAPQSLLEDKKEPFSQVNFCD 
ECISYLEKNNS 


3093 


A 


1 


3868 


PPDNQKLGLLEALLKIGDWQHAQNIMDQMPPYY 

AASHKLIALAICKLIHITIEPLYRSVTSWAVDHAG 

FLESDPCDSTVGHLLSRVGVPKGAKGSPVNALQ 

NKRAPKQAESFEDLRRDVFNMFCYLGPHLSHDPI 

LFAKVVRIGKSFMKEFQSDGSKQEDKEKTEVILS 

CLLSITDQVLLPSLSLMDCNACMSEELWGMFKT 

FPYQHRYRLYGQWKNETYNSHPLLVKVKAQTID 

RAKYIMKRLTKENVKPSGRQIGECLSHSNPTILFD 

YVCFEILSQIQKYDNLITPVVDSLKYLTSLNYDVL 

ACILSNCnEALANPEKERMKHDDTTISSWLQSLA 

SFCGAVFRKYPIDLAGLLQYVANQLKAGKSFDL 

LELKEVVQKMAGIEITEEMTMEQLEAMTGGEQL 

KAEGGYFGQIRNTKKSSQRLKDALLDHDLALPL 

CLLMAQQRNGVIFQEGGEKHLKLVGKLYDQCH 

DTLVQFGGFLASNLSTEDYIKRVPSIDVLCNEFHT 

PHDAAFFLSRPMYAHHISSKYDELKKSEKGSKQ 

QHKVHKYITSCEMVMAPVHEAVVSLHVSKVWD 

DISPQFYATFWSLTMYDLAVPHTSYEREVNKLK 

VQMKAmDNQEMPPNKKKKEKERCTALQDKLL 

EEEKKQMEHVQRVLQRLKLEKDNWLLAKSTKN 

ETITKJFLQLCIFPRCIFSAIDAVYCARFVELVHQQ 

KTPNFSTLLGYDRVFSDIIYTVASCTENEASRYGR

FLCCMLETVTRWHSDRATYEKECGNYPGFLT1L 

RATGFDGGNKADQLDYENFRHVVHKWHYKLT

KASVHCLETGEYTHIRMLrVLTKILPWYPKVLNL 

GQALERRVHKICQEEKEKRPDLYALAMGYSGQL 

KSRKSYMIPENEFHHKDPPPRNAVASVQNGPGG 

GPSSSSIGSASKSDESSTEETDKSRERSQCGVKAV 

NKASSTTPKGNSSNGNSGSNSNKAVKENDKEKG 

KEKEKEKKEKTPATTPEARVLGKDGKEKPKEER 

PNKDEKARETKERTPKSDKEKEKFKKEEICAKDE 

KFKTTVPNAESKSTQEREREKEPSRERDIAKEMK 

SKENVKGGEKTPVSGSLKSPVPRSDIPEPEREQKR 

RKIDTHPSPSHSSTVKDSLEELKESSAKLYINHTPP 

KERKRDHSNNDREVPPDLTKRRKEENGTMGVSK 
HKSESPCESPYPNEKDKEKNKSKSSGKEKGSDSF 
KSEIOIDKISSGGKKESRHDKEKIEKKEKRDSSGG 
KEEKKHHKSSDKHR 


3094 


A 


2 


891 


AMLGTREPSRRGAGAVQAEVSERLAMAGPQQQ 

PPYLHLAELTASQFLEIWKHFDADGNGYIEGKEL 

ENFFQELEKARKGSGMMSKSDNFGEKMKEFMQ 

KYDKNSDGKIEMAELAQILPTEENFLLCFRQHVG 

ooAiir IVLriA W isJs. Y U 1 JJKdxj Y icANJbLKurLoJJLrL 

KKANRPYDEPKLQEYTQTILRMFDLNGDGKLGL 

SEMSRLLPVQENFLLKFQGMKLTSEEFNAIFTFY 

DBCDRSGYTOEHELDALLKDLYEKNKKEMNIQQL 

T>mUCSVMSLAEAGKLYRKDLEIVLCSEPPM 


3095 


A 


1685 


700 


RRPTGRPGALGAPAAGRVGMPLHVKWPFPAVPP 

LTWTLASSVVMGLVGTYSCFWTKYMNHLTVHN 

REVLYELIEKRGPATPLITVSNHQSCMDDPHLWG 

ILKLRHIWNLKLMRWTPAAAD1CFTKELHSHFFS 

LGKCVPVCRGAEFFQAENEGKGVLDTGRHMPG 

AUKi<JKJbJMjrUu V Y(^KUJVLL>r ILbKLNHGDW VH1F 

PEGKVNMSSEFLRFKWGIGRLIAECHLNPIILPLW 

HVGMNDVLPNSPPYFPRFGQKITVLIGKPFSALP 

VLERLRAENKSAVEMRKALTDFIQEEFQHLKTQ 

AEQLHNHLQAWEIGLACCLLDSWPAQSWG 


3096 


A 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVM 
EAQPEWLRAEVKRLSHELAETTREKIQAAEYGL 
AVLEEKHQLKLQFEELEVDYEAIRSEMEQLKEAF 
GQAHTNHKKVAADGESREESLIQESASKEQYYV 
RKVLELQTELKQLRNVLTNTQSENERLASVAQE 
LKEINQNVEIQRGRLRDDIKEYKFREARLLQDYS 
ELEEENISLQKQVSVLRQNQVEFEGLKHEIKRLE 
EETEYLNSQLEDAIRLKEISERQLEEALETLKTER 
EQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKF 
SDDAAEPNNDAEALVNGFEHGGLAKLPLDNKTS 
TPKKEGLAPPSPSLVSDLLSELNISEIQKLKQQLM 
QMEREKAGLLATLQDTQKQLEHTRGSLSEQQEK 
VTRLTENLSALRRLQASKERQTALDNEKDRDSH 
EDGDYYEVDINGPEILACKYHVAVAEAGELREQ 
LKALRSTHEAREAQHAEEKGRYEAEGQALTEKV 
SLLEKASRQDRELLARLEKELKXVSDVAGETQG 
SLSVAQDELVTFSEELANLYHHVCMCNNETPNR 
VMLDYYREGQGGAGRTSPGGRTSPEARGRRSPI 
LLPKGLLAPEAGRADGGTGDSSPSPGSSLPSPLSD 
PRREPMNIYNLIAIIRDQIKHLQAAVDRTTELSRQ 
RIASQELGPAVDKDKEALMEEILKLKSLLSTKRE 
QITTLRTVLKANKQTAEVALANLKSKYENEKAM

X/TnPTTWTMTirT "P XTPT VAT -VCH A A TTCCCT T> A \/TC A TT> 

DEYITQLDEMQRQLAAAEDEKJCTLNSLLRMAIQ 
QKLALTQRLELLELDHEQTRRGRAKAAPKTKPA 
TPSVSHTCACASDRAEGTGLANQVFCSEKHSIYC 
D 


3097 


A 


1 


879 


MVKVVPATRGNLPRSQLTGTHQHCQPREPKITA 
SERLRRRPRATARLRAHAAPPEPPLAVFAPPSDR 
KELLALPVACDPVIASVMSWVQAASLIQGPGDK 
GDVFDEEADESLLAQREWQSNMQRRVKEGYRD 

TLSALLSWCHLHNnWSTl.INKINNLLDAVGQCEE 
YVLKHLKSITPPSHWDLLDSIEDMDLCHVVPAE 
KXIDEAKDERLCENNAEFTsIKNCSKSHSGIDCSYV 
ECCRTQEHAHSGKPKPHMDFGTDSQF 


3098 


A 


2 


505 


GAATLLRSASSAARKAAEAEQVWLHLHRYLSA 
DRRVI GT RFWGRPA^FRFP^T PORT K'RFT WMnn 

VEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGL 
FGRKTGQAPGYSYTAANKNKGIIWGEDTLMEYL 
ENPKKYIPGTKMIFVGIKKKEERADLIAYLKKAT 
NE 


3099 


A 


144 


1386 


WAVGQARSFPSHPRMSSWIWSRRWSPSVALRVT 

CTSTSSQRWTVLALSKPGSQQQVSMHTPAPGPPT 

AGHTEPPSEPPRRARVAKYRAKFDPRVTAKYDIK 

ALIGRGSFSRVVRVEHRATRQPYAIKMIETKYRE 

GREVCESELRVLRRVRHANIIQLVEVFETQERVY 

MVMELATGGELFDRIIAKGSFTERDATRVLQMV 

LDGVRYLHALGITORDLKPENLLYYHPGTDSKIII 

TDFGLASARKKGDDCLMKTTCGTPEYIAPEVLV 

1? If PYTWWnMW A T GVTA VTT T QGTK/TPFFnrVNTRT 

RLYRQILRGKYSYSGEPWPSVSNLAKDFIDRLLT 
VDPGARMTALQALRHPWVVSMAASSSMKNLHR 
SISQNLLKRASSRCQSTKSAQSTRSSRSTRSNKSR 
RVRERELREL 


3100 


A


3


1500 


ARWNGRWVQVPAWPGPGCGTNASGERQRQLPR 

AWRPVGRTLGSEPIALAWSPPLYLFPIPLPSWAVS 

QPTPTLGTMFADLDYDEEEDKLGIPTVPGKVTLQ 

KDAQNLIGISIGGGAQYCPCLYIVQVFDNTPAAL 

DGTVAAGDEITGVNGRSIKGKTKVEVAKMIQEV 

KGEVTIHYNKLQADPKQGMSLDIVLKKVKHRLV 

ENMSSGTADALGLSRAILCNDGLVKRLEELERTA 

ELYKGMTEHTIQnXLRAFYELSQTHRGNGIPQSC 

AFGDVFSVIGVREPQPAASEAFVKFADAHRSIEK 

FGIRLLKTIKPMLTDLNTYLM<AIPDTRLTIKKYL 

nVl^FFYT 9VPT l^Vlf FVTTYnFFV^PTAT HFPT VPV 

STG>TYEYRLILRCRQEARARFSQMRKDVLEKME 
LLDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLR 
DADVFPIEVDLAHTTLAYGLNQEEFTDGEEEEEE 
EDTAAGEPSRDTRGAAGPLDKGGSWCDS 


3101 


A 


1173 


197 


QGMDSKQQCVKLNDGHFMPVLGFGTYAPPEVP 

RSKALEVTKLAIEAGFRHIDSAHLYNNEEQVGLA 

IRSKIADGSVKREDIFYTSKLWSTFHRPELVRPAL 

ENSLKKAQLDYVDLYLIHSPMSLKPGEELSPTDE 

NGKVIFDIVDLCTTWEAMEKCKDAGLAKSIGVS 

SKLLDFCKSKDIVLVAYSALGSQRDKRWVDPNS 
PVLLEDPVLCALAKKHKRTPALIALRYQLQRGV 
WLAKSYNEQRIRQNVQVFEFQLTAEDMKAIDG 
LDRNILHYFNSDSFASHPNYPYSDEY 


3102 


A 


144 


1098 


EQPRPPPCGRRPLPLGSAPCRVRLGRAPRQAPAM 

SMLPSFGFTQEQVACVCEVLQQGGNLERLGRFL 

WSLPACDHLHKNESVLKAKAVVAFHRGNFREL 

YKILESHQFSPHNHPKLQQLWLKAHYVEAEBCLR 

GRPLGAVGKYRVRQKFPLPRTIWDGEETSYCFK 

EKSRGVLREWYAHNPYPSPREKRELAEATGLTT 

TQVSNWFI<NRRQRDRAAEAKERENTENNNSSSN 
KQNQLSPLEGGKPLMSSSEEEFSPPQSPDQNSVLL 
LQGNMGHARSSNYSLPGLTASQPSHGLQTHQHQ 
LQDSLLGPLTSSLVDLGS 


3103 


A 


111 


1582 


LVYSWGCHIMADNDTDRNQTEKLLKRVRELEQ 

EVQRLKKEQAKNI<£DSNIRENSSGAGKTKRAFD 

FSAHGRimVALRMYMGWGYQGFASQENTNNTI 

EEKLFEALTKTRLVESRQTSNYHRCGRTDKGVS 

AFGQVISLDLRSQFPRGRDSEDFNVKEEANAAAE 

EIRYTHILNRVLPPDTRILAWAPVEPSFSARFSCLE 

RTYRYFFPRADLDIVTMDYAAQKYVGTHDFRJSIL 

CKMDVANGVINFQRTILSAQVQLVGQSPGEGRW 

QEPFQLCQFEVTGQAFLYHQVRCMMAELFLIGQ 

GMEKPEIIDELLNIEBCNPQKPQYSMAVEFPLVLY 

KTHMLYSMLQGLDTVPVPCGIGPKMDGMTEWG 
NVKPSVDCQTSAFVEGVKMRTYKPLMDRPKCQG 
LESRIQHFVRRGRIEHPHLFHEEETKAKRDCNDT 
LEEDNTNLETPTKRVCVDTEIKSII 


3104 


A 


227 


1519 


VTLIKMNAMLETPELPAVFDGVKLAAVAAVLYV 

IVRCLNLKSPTAPPDLYFQDSGLSRFLLKSCPLLT 

KEYIPPLIWGKSGHIQTALYGKMGRVRSPHPYGH 

RKFITMSDGATSTFDLFEPLAEHCVGDDITMVICP 

GIANHSEKQYIRTFVDYAQKNGYRCAVLNHLGA 

LPNIELTSPRMFTYGCTWEFGAMVNYIKKTYPLT 

QLVWGFSLGGNIVCKYLGETQANQEKVLCCVS 

VCQGYSALRAQETFMQWDQCRRFYNFLMADN 

TSLMQIDDNVMRKFHGYNSLKEYYEEESCMRYL 
HRIYVPLMLVNAADDPLVHESLLTIPKSLSEKRE
NVMFVLPLHGGHLGFFEGSVLFPEPLTWMDKLV 
VEYANAICQWERNKLQCSDTEQVEADLE 


3105 


A 


1 


1251


MGLLLMILASAVLGSFLTLLAQFFLLYRRQPEPP 

ADEAAJRAGEGFRYTKPVPGLLLREYLYGGGRDE 

EPSGAAPEGGATPTAAPETPAPPTRETCYFLNATI 

LFLFRELRDTALTRRWVTKKIKVEFEELLQTKTA 

GRLLEGLSLRDX^GETWHKTIRLVRPVVPSAT 

GEPDGPEGEALPAACPEELAFEAEVEYNGGFHLA 

IDVDLVFGKSAYLFVKLSRWGRLRLVFTRVPFT 

HWFFSFVEDPLIDFEVRSQFEGRPMPQLTSIIVNQ 

T KKTTIf 'RTCRTT P>JVT^TR"PTCPFFPVnTT nfTFPFnPP 
jjj\jc\jj.r\j\^\jj. i jur in i iviivr ivir r jt r x i j_iV^v_jp JZ/jZixJiZiSZi 

HIHIQQWALTEGRLKVTLLECSRLLIFGSYDREA 
NVHCTLELSSSVWEEKQRSSIKTGTISLTAVFMG 
WHRVSEAFPGLWYKLLVDLPFWGLEDGGPLLT 
VPLRQCPG 


3106 


A 


972 


468


MAAAGAGRLRRVASALLLRSPRLPARELSAPAR 
LYrT^VVDHYENPR>rVGSLDKTSKNVGTOT VG 

i-> X 1 XXVXN. V V J~/J. X 1 i_/l>X XVI \ V \J LJ 1 1 1 •'XV X ±JX\J. y V VJ X VJXv V >J 

APACGDVMKLQIQXHDEKGKIVDARFKTFGCGSA 
IASSSLATEWVKGKTVEEALTIKNTDIAKELCLPP 
VKLHCSMLAEDAIKAALADYKLKQEPKKGEAE 
KK 


3107 


A 


106 


1221 


TCQDVRSVFSLVRANIFGEESTAGAGWHREEDM 
RKELQLSLSVTLLLVCGFLYQFTLKSSCLFCLPSF 
KSHQGLEALLSHRRG1VFLETSERMEPPHLVSCS 
VESAAKIYPEWPWFFMKGLTDSTPMPSNSTYPA 

FSFLSAIDNVFLFPLDMKRLLEDTPLFSWYNQINA 
SAER3STWLHISSDASRLAnWKYGGIYMDTDVISIR 

NFVEHYNSAIWGNQGPELMTRMLRVWCKLEDF 
QEVSDLRCLNISFLHPQRFYPISYREWRRYYEVW 
DTEPSF^SYALHLWNHMNQEGRAVIRGSNTLV 
ENLYRKHCPRTYRDLIKGPEGSVTGELGPGNK


3108 


A 


1612 


839 


EVALFCFEMAAGMYLEHYLDSIENLPFELQRNFQ 
LMRDLDQRTEDLKAEIDKLATEYMSSARSLSSEE 
KLALLKQIQEAYGKCKEFGDDKVQLAMQTYEM 

VTYK"HTPPT TYTnT APPPArVT T^PVr^rPQQPiVr^CCCC 
V J^rvJnLli\XVL«JL/ 1 JL/ Lsf\X\Jr JD/\JLyJL JSJDJVV^iJio 1 U o o o o 

KGKKKGRTQKEKKAARARSKGKNSDEEAPKTA 
QKKLKLVRTSPEYGMPSVTFGSVHPSDVLDMPV 
DPNEPTYCLCHQVSYGEMIGCDNPDCSIEWFHFA 
CVGLTTKPRGKWFCPRCSQERKKK 


3109 


A 


1 


2613 


MVAVRAAGPREGASQDEAGTVWAPMTGCPCQC 

RPGPSWLLVDTLEPETAYPVQRPGPEQAGNQRL

QMKRAQFGPHDWLSLPVPPGPSWLLVDTLEPET 

AYQFSVLAQNKLGTSAFSEWTVNTLAFP1TTPEP.. 

LVLVTPPRCLIANRTQQGVLLSWLPPANHSFPIDR 

YIMEFRVAERWELLDDGIPGTEGEFFAKDLSQDT 

WYEFRVLAVMQDLISEPSNIAGVSSTDIFPQPDLT 

EDGLARPVLAGIVATICFLAAAILFSTLAACFVNK 

QRKRKLKRKKDPPLSITHCRKSLESPLSSGKVSPE 

SIRTLPJVPSESSDDQGQPAAKRMLSPTREKELSL 

YKKTKRAISSKICYSVAKAEAEAEATTPIELISRGP

DGRFVMDPAEMEPSLKSRRIEGFPFAEETDMYPE 

FRQSDEENEDPLVPTSVAALKSQLTPLSSSQESYL 

PPPAYSPRFQPRGLEGPGGLEGRLQATGQARPPA 

PRPFHHGQYYGYLSSSSPGEVEPPPFYVPEVGSPL 

SSVMSSPPLPTEGPFGHPTIPEENGENASNSTLPLT 

QTPTGGRSPEPWGRPEFPFGGLETPAMMFPHQLP 

PCDVPESLQPKAGLPRGLPPTSLQVPAAYPGILSL 

EAPKGWAGKSPGRGPVPAPPAAKWQDRPMQPL 

VSQGQLRHTSQGMGPVLPYPEPAEPGAHGGPST 

FGLDTRWYEPQPRPRPSPRQARRAEPSLHQVVLQ 

P^PT QPT Tf^PT QQPTnCPTnT A AP APDPPnT T (~\r\ A 

EMSEITLQPPAAVSFSRKSTPSTGSPSQSSRSGSPS 
YRPAMGFTTLATGYPSPPPGPAPAGPGDSLDVFG
QTPSPRRTGEELLRPETPPPTLPTLGKLRRDRPAP 
ATSPPERALSKL 


3110 


A 


88 


924 


ILGSRTMSLTNTKTGFSVKD1LDLPDTNDEEGSV 
AEGPEEENEGPEPAKRAGPLGQGALDAVQSLPL 
KNPFYDSSDNPYTRWLASTEGLQYSLHGLAAGA 
PPQDSSSKSPEPSADESPDNDKETPGGGGDAGKK 

SLniLTPTQVKIWFQNHRYKMKJRARAEKGMEVT 
PLPSPRRVAVPYLVRDGKPCHALKAQDLAAATF 
QAGIPFSAYSAQSLQHMQYNAQYSSASTPQYPT
A HP! VOAOOWTW 


3111 


A 


595 


291 


PSVASLARRFSGRALWPPSHSVPGNRALCPRLLH 
GmPGGNQRELARQKNMKKQSDSVKGKRRDD 
GLSAAARKQRDSTPRDSEIMQQKQKKANEKKEE 
PK 


3112 


A 


3641 


1555 


APMLQIHHFSFKLIFQNIHKSKFISQRLSQNADST 

RHTNLSNTHYSDLIVWNCCLFFR2WCNEFFLKS 

CHFAQEREGSGDLCNSRAEKTKSAACVIFRRPPV 

APLIPYPLITKEDINAIEMEEDKKDLISREISKFRDT 

HKKLEEEKGKKEKERQEIEKERRERERERERERE 

RREREREREREREREKEKERERERERDRDRDRTK 

ERDRDRDRERDRDRDRERSSDRNKDRSRSREKS 

RDRERERERERERERERERERERERERERERERE

REREKDKKRDREEDEEDAYERRKLERKLREKEA 

AYQERLKNWEIRERKKTREYEKEAEREEERRRE 

MAKEAKRLKEFLEDYDDDRDDPKYYRGSALQK 

RLRDREKEMEADERDRKREKEELEEIRQRLLAE 

GHPDPDAELQRMEQEAERRRQPQIKQEPESEEEE 

EEKQEKEEKREEPMEEEEEPEQKPCLKPTLRPISS 

APSVSSASGNATPNTPGDESPCGIIIPHENSPDQQ 

QPEEHRPKIGLSLKLGASNSPGQPNSVKRKKLPV 

TKGTVNTEEKRKH1KSLIEKIPTAKPELFAYPLDW 
SIXODSILMERRIRPWTNKKnEYIGEEEATLVDLVC 
SKVMAHSPPQSELDDVAMVLDEEAEVFIVKMWR 
LLIYETEAKKIGLVK 


3113


A 


1 


669 


VCAGERDPCSTPLAKPAAGGAENLSFGKQPGLET 
NILKMTTPNKTPPGADPKQLERTGTVREIGSQAV 

WQQFRRKTTVKTLCIYADYKSDESYTPSKISVRV 
GNNFHNLQEIRQLELVEPSGWIHVPLTDNHKKPT 
RTFMIQIAVLANHQNGRDTHMRQIKIYTPVEESSI 
GKFPRCTTIDFMMYRSIR 


3114 


A 


1 


1613 


MTSKEESRRQQPTAGPAGQGKLPSPSEPQLPTPP 

TRSLHHFRRPLSPSREAQAHIAPSSELHLPQSQSA 

GPPPLGAGTEVELVVPGRDEGSRGALPGSSGVKF 

VWRKJVRFPVSDQVRTLSISRLMRRLLEMMQTL 

VQFHGWRSLLGRTLGTIMNTMYVMMAQILRSH 

LIKATVIPNRVKMLPYPGnRNRMMSTHKSKKKI 

REYYRLLNVEEGCSADEVRESFHKLAKQYHPDS 

GSNTADSATFIRIEKAYRKVLSHVIEQTNfASQSK 

GEEEEDVEKFKYKTPQHRHYLSFEGIGFGTPTQR 

EKHYRQFRADRAAEQVMEYQKQKLQSQYFPDS 

VIVKNIRQSKQQKITQAIERLVEDLIQESMAKGDF 

YQPEWILKQKEISDTIEQLREAILVSRKKLGNPMT 
PTEKKQWHVCEQFQENIRKLNKRINDFNLIVPI 
LTRQKVHFDAQKEIVRAQKIYETLLKTKEVTDRN 
PNNLDQGEGEKTPEIKKGFLNLMDLVEIY 


3115 


A 


1 


2036 


FRHRCGCLSYCRSRRGIRRVEPLRRARARVGPRF 

RPLCRMEURSNFKSNLHKVYQAIEEADFFAIDGE 

FSGISDGPSVSALTNGFDTPEERYQKLKKHSMDF 

LLFQFGLCTFKYDYTDSKYITKSFNFYVFPKPFNR 

SSPDVKFVCQSSSEDFLASQGFDFNKGFRKGIPYL 

NQEEERQLREQYDEKRSQANGAGALSYVSPNTS 

KCPVTIPEDQI<XFmQVVEKJEDLLQSEENKNLDL 

EPCTGFQRIO.IYQTLSWKYPKGIHVETLETEKKE 

RYIVISKVDEEERKRREQQKHAKEQEELNDAVG 

FSRVIHAIANSGKXVIGHNMLLDVMHTVHQFYC 

PLPADLSEFKEMTTCVFPRLLDTKLMASTQPFKD 

IINNTSLAELEKRLKETPFNPPKVESAEGFPSYDT 

ASEQLHEAGYDAYITGLCFISMANYLGSFLSPPKI 
HVSARSKLffiPFFNKLFLMRVMDIPYLNLEGPDL 
QPKRDHVLHVTFPKEWKTSDLYQLFSAFGNIQIS 
WIDDTSAFVSLSOPFOVKTA VWT^TCYAF^VT? TOT 

YAEYMGRKQEEKQIKRKWTEDSWKEADSKRLN 
PQCIPYTLQNHYYRNNSFTAPSTVGKR>ILSPSQE 
EAGLEDGVSGEISDTELEQTDSCAEPLSEGRKKA 
KKLKRMKKELSPAGSISKNSPATLFEVPDTW 


3116 


A 


3 


1443 


TREAPMALAVAPWGRQWEEARALGRAVRMLQ 

RLEEQCVDPPLLSVSPPSLRDLLPRTAQLLREVAH 

SRRAAGGGGPGGPGGSGDFLLIYLANLEAKSRQ 

VAALLPPRGRRSANDELFRAGSRLRRQLAKLAII 

FSHMHAELHALFPGGKYCGPIMYQLTKAPAHTF 

WRESCGARCVLPWAEFESLLGTCHPVEPGCTAL 

ALRTT1DLTCSGHVSIFEFDVFTRLFQPWPTLLKN 

WQLLAVNHPGYMAFLTYDEVQERLQACRDKPG 

SYIFRPSCTRLGQWAIGYVSSDGSILQTIPANKPLS 

QVLLEGQKDGFYLYPDGKTHNPDLTELGQAEPQ 

ORIHVSEEOLOLYWAMDSTFFT PK'TPAF^'Wft'nV 

KIEPCGHLLCSCCLAAWQHSDSQTCPFCRCEIKG 
WEAVSIYQFHGQATAEDSGNSSDQEGRELELGQ 
VPLSAPPLPPRPDLPPRKPRNAQPKVRLLKGNSPP 
AALGPQDPAPA 


3117 


A 


296 


3547 


ERHSSPLLQHILTHALMRNKJCHSNNWLAQHWF 
QSSIILCFSPyGRTLRVRARKFPAIVNCTAIDWFH 
AWPQEALVSVSRRFIEETKGIEPVHKDS1SLFMAH 
VHTTVNEMSTRYYQNERRHNYTTPKSFLEQTSLF
KNLLKKKQNEVSEKKERLVNGIQKLKTTASQVG 
DLBCARLASQEAELQLRNHDAEALITK1GLQTEKV 
SREKTIADAEERKVTAIQTEVFQKQRECEADLLK 
AEPALVAATAALNTLNRVNLSELKAFPNPPIAVT 
NVTAAVMVLLAPRGRVPKDRSWKAAKVFMGK 
VDDFLQALINYDKEHIPENCLKVVNEHYLKDPEF 
NPNLIRTKSFAAAGLCAWVINIIKFYEVYCDVEP 
KRQALXQA3^ELAAATEKLEAIRKKLVVSANYD 
IEKSEKIRWGQSIKSFEAQEKTLCGDVLLTAAFVS 
YVGPFTRQYRQELVHCKWVPFLQQKVSIPLTEG 
LDLISMLTDDATIAAWNNEGLPSDRMSTENAADL 
THCERWPLVIDPQQQGIKWIKNKYGMDLKVTHL 
GQKGFLNAIETALAFGDVILIENLEETIDPVLDPL 
LGRNTn<XGKYIRIGDKECEFNKNFRLILHTKLAN 
PHYKPELQAQTTLLNFTVTEDGLEAQLLAEWSI 
ERPDLEKLKLVLTKHQNDFKIELKYLEDDLLLRL 
SAAEGSFLDDTBCLVERLEATKTTVAEIErlKVIEA 
KENERKINEARECY11PVAARASLLYFVINDLQKI 
NPLYQFSLKAFNVLFHRAIEQADKVEDMQGRISI 
LMESITHAVFLYTSQALFEKDKLTFLSQMAFQIL 
LRKICEmPLELDFLLRFTVEHTHLSPVDFLTSQSW 
SAIKAIAVMEEFRGIDRDVEGSAKQWRKWVESE 
CPEKEKLPQEWKKKSLIQKLILLRAMRPDRMTY 
ALRNFVEEKLGAKYVERTRLDLVKAFEESSPATP 
IFFILSPGVDALKDLEE^GKRLGFTIDSGKFHNVSL 
GQGQETVAEVALEKASKGGHWVILQNVHLVAK 
WLGTLEKLLERFSQGSHRDYRVFMSAESAPTPD 
EHUPQGLLENSDQTNEPPTGMLANLHAALYNFD 

Q 


3118 


A 


1 


226 


PYSLSTSCLGSPTSPRLEMDPNCSCATGGSCTCTG 
SCKCKECKCNSCKKSECGAISRNLGLSQVRGRKP 
ELGMEE 


3119 


A 


1254 


4133 


PLATLTMEEQGHSEMEDPSESHPHIQLLKSNREL 
LVTHIRNTQCLVDNLLKNDYFSAEDAEIVCACPT 
QPDKVRKILDLVQSKGEEVSEFFLYLLQQLADAY 
VDLRPWLLEIGFSPSLLTQSKWVNTDPVSRYTQ 
QLRHHLGRDSKFVLCYAQKEELLLEErVMDTIME 
LVGFSNESLGSLNSLACLLDHTTGILNEQGETIFIL 
GDAGVGKSMLLQRLQ1SLWATGRLDAGVKFFFH 
FRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEE 
VFAFLLRFPHVALFTFDGLDELHSDLDLSRVPDS 
SCPWEPAHPLVLLANLLSGKLLKGASKLLTART 
GIEVPRQFLRKKVLLRGFSPSHLRAYARRMFPER 
ALQDRLLSQLEANPNLCSLCSVPLFCWHFRCFQH 
FRAAFEGSPQLPDCTMTLTDVFLLVTEVHLNRM
QPSSLVQRNTRSPVETLHAGRDTLCSLGQVAHR 
GMEKSLFVFTQEEVQASGLQERDMQLGFLRALP 
ELGPGGDQQSYEFFHLTLQAFFTAFFLVLDDRVG 
TQELLRFFQEWMPPAGAATTSCYPPFLPFQCLQG 
. SGPAREDLFK1«DHFQFTOLFLCGLLSKAKQKLL 
RHLVPAAALRRKRKALWAHLFSSLRGYLNSLPR 
VQVESFNQVQAMPTFIWMLRCIYETQSQKVGQL 
AARGICANYLKLTYCNACSADCSALSFVLHHFP 
KRLALDLDNNNLNDYGVRELQPCFSRLTVLRLS 
VNQITDGGVKVLSEELTKYKIVTYLGLYNNQ1TD 
VGARYVTKILDECKGLTHLKLGKNKITSEGGKY 

Lsf\L,S\ V JV1N orvolO-C, V OlVl W OIN V OJJUvJ AlS^Ar AilA 

LRKHPSLTTLSLASNGISTEGGKSLARALQQNTSL 
EILWLTQNELNDEVAESLAEMLKVNQTLKHLWL 
IQNQITAKGTAQLADALQSNTGITEICLNGNLDCP 
EEAKVYEDEKRHCF 


3120 


A 


43 


1004 


QLWGFAAGSDSRPAMGCDGGTIPKRHELVKGPK 
KVEKVDKDAELVAQWNYCTLSQEILRRPIVACE 
LGRLYNKDAVIEFLLDKSAEKALGKAASHIKSIK
NVTELKLSDNPAWEGDKGNTKGDKHDDLQRAR 
FICPVVGLEMNGRHRFCFLRCCGCVFSERALKEI 

IS^Ali V Cxi 1 LUAAr v^XjUU V I VLrJN kj 1 rvfcJJ VD V l^rv 1 X 

MEERRLRAKLEKKTKKPKAAESVSKPDVSEEAP 
GPSKVKTGKPEEASLDSREKKTNLAPKSTAMNE 
SSSGKAGKPPCGATrCRSIADSEESEAYKSLFTTHS 
SAKRSKEESAIWVTHTSYCF 


3121 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR

VLRSELENLRSKIQKLESDVSAQMEYCRTPCTVS

CMPVVSGKECEEIIRKGGETSEMYLTOPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 
EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 
AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP 
FFPQQ 


3122 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV

QNEANKYQISVNKYRGTAGNALMDGASQLMGE

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP 

FFPQQ 


3123 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP 

FFPQQ 


3124 


A


3 


544 


RVDDFVLLRSRLALRWLSHVRRPSRRVPRMPRG 

QT) CI? TCD A/f A TJTJ A CD A TtfW /TD A A TJO Ti A n\ r A Ann a a 

£>KbK 1 oKMAPrAbKAFQMixAArRFAPVAQPPAA 

APPSAVGSSAAAPRQPGLMAQMATTAAGVAVG 

SAVGHTLGHAITGGFSGGSNAEPARPDITYQEPQ 

GTQPAQQQQPCLYEIKQFLECAQNQGDIKLCEGF 

NEVLKQCRLANGLA 


3125 


A 


3 


571 


GNSYNHRSLAAYPYMSHSQHSPYLQSYHNSSAA
A^| 1KOJJ1J 1 Dt^K 1 1 VliiNublKrNGKGKKIRKPR 
TIYSSLQLQALNHRFQQTQYLALPERAELAASLG 
LTQTQVKIWFQNKRSKFKKLLKQGSNPHESDPL 
QGSAALSPRSPALPPVWOVSASAKGVSMPPNSY 
MPGYSHWYSSPHQDTMQRPQMM 


3126 


A


43 


5377 


LSVFFPIPVDGRDRGSNPSLESTSSELSTSTSEGSL 
SAMSGRNELHSRLHPHPQSSLPMMFSPPESLLAS 
Cn^RGNFAEAHOVTFTFNLKSSPSSGFT MFMFRY 

QEVIQELAQ\nEHKIENQNSDAGSSTIRRTGSGRST 
LQAIGSAAAAGMVFYSISDVTDKLLNTSGDPPM 
LQEDFWISTALVEPTAPLREVLEDLSPPAMAAFD 
LACSQCQLWKTCKQLLETAERRLNSSLERRGRRI 

DHVLLNADGIRGFPVVLQQISKSLNYLLMSASQT 

KSESVEEKGGGPPRCSITELLQMCWPSLSEDCVA 

SHTTLSQQLDQVLQSLREALELPEPRTPPLSSLVE 

QAAQKAPEAEAHPVQIQTQLLQKNLGKQTPSGS 

RQMDYLGTFFSYCSTLAAVLLQSLSSEPDHVEVK 

VGNPFVLLQQSSSQLVSHLLFERQVPPERLAALL 

AQENLSLSVPQVIVSCCCEPLALCSSRQSQQTSSL 

LTRLGTLAQLHASHCLDDLPLSTPSSPRTTENPTL 

ERKPYSSPRDSSLPALTSSALAFLKSRSKLLATVA 

CLGASPRLKVSKPSLSWKELRGRREVPLAAEQV 

ARECERLLEQFPLFEAFLLAAWEPLRGSLQQGQS 

LAVNLCGWASLSTVLLGLHSPIALDVLSEAFEES 

LVARDWSRALQLTEVYGRDVDDLSSIKDAVLSC 

AVACDKEGWQYLFPVKDASLRSRLALQFVDRW 

PLESCLEILAYCISDTAVQEGLKCELQRKLAELQ 

VYQKILGLQSPPWCDWQTLRSCCVEDPSTVMN

MILEAQEYELCEEWGCLYPIPRErlLISLHQKHLL 

HLLERRDHDKALQLLRRJPDPTMCLEVTEQSLDQ 

HTSLATSHFLANYLTTHFYGQLTAVRHREIQALY 

VGSKILLTLPEQHRASYSHLSSNPLFMLEQLLMN 

MKVDWATVAVQTLQQLLVGQEIGFTMDEVDSL 

LSRYAEKALDFPYPQREKRSDSVIHLQEIVHQAA 

DPETLPRSPSAEFSPAAPPGISSIHSPSLRERSFPPT 

QPSQEFVPPATPPARHQWVPDETESICMVCCREH 

FTMrl^RRHHCRRCGRLVCSSCSTKKMVVEGCRE 

NPARVCDQCYSYCNKDVPEEPSEKPEALDSSKSE 

SPPYSFVVRVPKADEVEWILDLKEEENELVRSEF 

YYEQAPSASLCIA1LNLHRDSIACGHQLIEHCCRL 

SKGLTNPEVDAGLLTDIMKQLIjSAKMMFVKAG 

QSQDLALCDSYISKVDVLNILVAAAYRHVPSLDQ 

ILQPAAVTRLRNQLLEAEYYQLGVEVSTKTGLDT 

TGAWHAWGMAGLKAGNLTAAREKFSRCLKPPF 

DLNQLNHGSRLVQDVVEYLESTVRPFVSLQDDD 

YFATLRELEATLRTQSLSLAVIPEGKIMNNTYYQ 

ECLFYLHNYSTNLAnSFYVRHSCLREALLHLLNK 

ESPPEV7IEGIFQPSYKSGKLHTLENLLESIDPTLES 

WGKYLIAACQHLQKKNYYHILYELQQFMKDQV 

RAAMTCIRFFSHKAKSYTELGEKLSWLLKAKDH 

LKIYLQETSRSSGRKKTTFFRKKMTAADVSRHM 

NTLQLQMEVTRFLHRCESAGTSQITTLPLPTLFG 

NNHMK^VACKVMLGGKNVEDGFGIAFRVLQ 

TlPfiT T\A A MTVPU A ADHT \/T?ynii r VOTJTAnT T Vr*\T 
lJr\^LtL/J\J\Xvi L I ^KAAivV^JL. VJtlJsJblV I ofc-Iv^V^LLivC V 

SESGMAAKSDGDTILLNCLEAFKRIPPQCCFCSA 
QELEGLIQAIHNDDNKVRAYLICCKLRSAYLIAV 
KQEHSRATALVQQVQQAAKSSGDAVVQDICAQ 
WLLTSHPRGAHGPGSRK 


3127 


A 


467 


1259 


HLGPPLAWIPAASLTSTKGEFGVEDDRPARGPPP 

PKSEEASWSESGVSSSSGDGPFAGGEVDKRLHQL 

KTQLATLTSSLATVTQEKSRMEASYLADKKKMK 

RLITQQHDRAQEQSDHALMLRELQBCLLQEERTQ 
RQDLELRLEETREALAGRAYAAEQMEGFELQTK 
QLTREVEELKSELQAIRDEKNQPDPRLQELQEEA 
ARLKSHFQAQLQQEMRKVIIHISFKHQPLT 


3128 


A 


1854 


798 


ASGSPAPSSSSAMAAACGPGAAGYCLLLGLHLFL 

LTAGPALGWNDPDRMLLRDVKALTLHYDRYTT 

SRRLDPIPQLKCVGGTAGCDSYTPKVIQCQNKG 

WDGYDVQWECKTDLDIAYKFGKTVVSCEGYES 

SEDQYVLRGSCGLEYNLDYTELGLQKLKJESGKQ 

HGFASFSDYYYKWSSADSCNMSGLmVVLLGIA 

FVVYKLFLSDGQYSPPPYSEYPPFSHRYQRFTOS 

AGPPPPGFKSEFTGPQNTGHGATSGFGSAFTGQQ 

GYENSGPGFWTGLGTGGILGYLFGSNRAATPFSD 

SWYYPSYPPSYPGTWMRAYSPLHGGSGSYSVCS 

NSDTKTRTASGYGGTRRR


3129 


A 


2340 


1192 


ELARRPKQQSSEKSIWMIRNWLTIFILFPLKLVEK 

CESSVSLTVPPWKLENGSSTNVSLTLRPPLNATL 

VITFEITFRSKNITILELPDEVVVPPGVTNSSFQVT 

SQNVGQLTVYLHGNHSNQTGPRIRFLVIRSSAISn 

NQVIGWIYF^AWSISFYPQVIMNWRRKSVIGLSF 

DFVALNLTGFVAYSVFNIGLLWVPYIKEQFLLKY 

PNGVNPVNSNDVFFSLHAWLTLIIIVQCCLYERG 

GQRVSWPAIGFLVLAWLFAFVTMIVAAVGVITW 

LQFLFCFSYIKLAVTLVKYFPQAYMNFYYKSTEG 

WSIGNVLLDFTGGSFSLLQMFLQSYNNDQWTLIF 

GDPTKFGLGVFSIVFDVVFFIQHFCLYRKRPGYD 

QLN 


3130 


A 


31 


2026 


CWWPPLLPQLEPEPPPLRPRVAASQGGGMLGKG 

WGGGGGTKAPKPSFVSYVRPEEIHTNEKEVTEK

EVTLHLLPGEQLLCEASTVLKYVQEDSCQHGVY 

GRLVCTDFKIAFLGDDESALDNDETQFKNKVIGE 

NDITLHCVDQIYGVFDEKKKTLFGQLKKYPEKLII 

HCKDLRWQFCLRYTKEEEVKRIVSGIIHHTQAP 

KLLKRLFLFSYATAAQNNTVTDPKNHTVMFDTL

KDWCWELERTKGNMKYKAVSVNEGYKVCERL 

PAYFWPTPLPEENVQRFQGHGIPIWCWSCHNGS 

ALLBCMSALPKEQDDGILQIQKSFLDGIYKTIHRPP 

YEIVKTEDLSSNFLSLQEIQTAYSFCFKQLFLIDNST 

EFWDTDIKWFSLLESSSWLDIIRRCLIGCAIEITEC 

MEAQNMNVLLLEENASDLCCL1SSLVQLMMDPH 

CRTRIGFQSL1QKEWVMGGHCFLDRCNHLRQND 

KEEHQRQLSLPLTQSKSSPKRGFFREETDHLIKNL 

LGKRISKLENSSDELQDNFREFYDSWHSKSTDYH 

GLLLPHIEGPEIKVWAQRYLRWIPEAQILGGGQV 

ATLSKLLEMMEEVQSLQEKIDERHHSQQAPQAE 

APCLLRNSARLSSLFPFALLQRHSSKPVLPTSGW 

KALGDEDDLAKREDEFVDLGDV 


3131 


A 


126 


965 


QSRSRPRREGVGTGSRAVLCILATCGSKMSDIGD 

WFRSPAITRYWFAATVAVPLVGKLGL1SPAYLF 

LWPEAFLYRFQIWRPITATFYFPVGPGTGFLYLV 

NL^TLYQYSTRLETGAFDGRPADYLFMLLFNWI 

CTVITGLAMDMQLLMIPLIMSVLYVWAQLNRDM 

IVSFWFGTRFKACYLPWVILGFNY1IGGSVINEL1G 

NLVGHLYFFLMFRYPMDLGGRKFLSTPQFLYRW

LPSRRGGVSGFGVPPASMRRAADQNGGGGRHN 

WGQGFRLGDQ 


3132 


A 


2 


350 


FVAGWRALTAPSTSARLRAFGWQAAARLLVFG 
ARGVGLGSGAPGSLPCYLRMDALALLGGLVNV
ARLPERWGPGRFDYWGNSHQJMHLLSVGSILQL 
HAGWPDLLWAAHHACPRD


3133 


A 


1 


2921 


MTCFKGQKGEQRSHAFEANKDHKAXVPSPNLYS 
QLNALQFTVDERSILWLNQFLLDLKQSLNQFMA 
VYKLNDNSKSDEHVDVRVDGLMLKFVIPSEVKS 
ECHQDQPRAISIQSSEMIATNTRHCPNCRHSDLEA 
LFQDFKDCDFFSKTYTSFPKSCDNFNLLHPIFQRH 
AHEQDTKMHEIYKGNITPQLNKNTLKTSAATDV 
WAVYFSQFWIDYEGMKSGKGRPISFVDSFPLSIW 
ICQPTRYAESQKEPQTCNQVSLNTSQSESSDLAG 
RLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 
FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLF 
LHESLILLSENLRKDVEAVTGSPASQTSICIGELLR 
SAELALLLHPVDQANTLKSPVSESVSPVVPDYLP
TENGDFLSSKRKQISRDINRIRSVTVNHMSDNRS 
MSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYL 
SDKHLGKISEDESSGLVYKSGSGEIGSETSDKKDS
FYTDSSSVLNYREDSNILSFDSDGNQNILSSTLTS 
KGNETIESIFKAEDLLPEAASLSENLDISKEETPPV 
RTLKSQSSLSGKPKERCPPNLAPLCVSYKNMKRS 
SSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKG
NKKNSTTNYRGTAESVNAGANLQNYGETSPDAI 
STNSEGAQENHDDLMSWVFKITGVNGEIDIRGE 
DTEICLQVNQVTPDQLGN1SLRHYLCNRPVGSDQ 
KAVEHSKSSPEISLRFESGPGAVIHSLLAEKNGFL 
QCHIENFSTEFLTSSLMNIQHFLEDETVATVMPM 

I^T/^kX/CXTTVrKTT VnHCDT) OOTWOT CD A D\ 7 T*\ 7TJ TT"\UT 

ivi^ VbJN 1 J^rsJLisJL/JLibrivoo 1 V oLbrArv 1 VJilDriL 
VVERSDDGSFHIRDSHMLNTGNDLKENVKSDSV
LLTSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 
FPEFSFDFTREQLMEENESLKQELAKAKMALAE 
AHLEKDALLHHIKKMTVE 


3134 


A 


9 


1579 


EEEGLSGGGPRVPCSLWGKQTMDYDFKAKLAA 

ERERVEDLFEYEGCKVGRGTYGHVYKARRKDG 

KDEKEYALKQIEGTGISMSACREIALLRELKHPN 

VIALQKVFLSHSDRKVWLLFDYAEHDLWHIIKFH 

RASKANKKPMQLPRS3V1VKSLLYQILDGIHYLHA 

NWVLHRDLKPAMLVMGEGPERGRVKIADMGF 

ARLFNSPLKPLADLDPVWTFWYRAPELLLGAR 

HYTKAIDI WAIGCBF AELLTSEPIFHCRQEDIKTSN 

PFHHDQLDRIFSVMGFPADKDWED1RKMPEYPT 

LQKDFRRTTYANSSLIKYMEKHKVKPDSKVFLL 

LQKLLTMDPTKRITSEQALQDPYFQEDPLPTLDV 

HQQPTAPPQQAAAPPQAPPPQQNSTQTNGTAGG 
AGAGVGGTGAGLQHSQDSSLNQVPPNKKPRLGP 
SGANSGGPVMPSDYQHSSSRLNYQSSVQGSSQS 
QSTLGYSSSSQQSSQYHPSHQAHRY 


3135 


A 


3


1111 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQ 

LSSRDPPGSLSAKKVRTEEKKAPRRVNGEGGSG 

GNSRQLQPPAAPSPQSYGSPASWSFAPLSAAPSPS 

SSRSSFSFSAGTAVPSSASASLSQPGPRKLLVPPTL 

LHAQPHHLLLPAAAAAASANAKSRRPKEKREKE 

RRRHGT GGARFAGH A 9R FFNGF VRTPT PRTYkTK'n 

KIKERDKEBLEREKKKHKVMNEIKJKENGEVKJLL 
KSGKEKPKTNIEDLQIKKVKKKKKKKHKENEKR 
KRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNI 
KDYVGKNLDTKNYDSKIPENSEFPFVSLKEPRVQ 

N>nLKRLDTLEFKQLIHIEHQPNGGASVIHCLQ 


3136 


A 


1442 


682 


TAAMSIFTPTNQIRLTNVAVVRMKRAGKRPEIAC 
YKNKWGWRSGVEKDLDEVLQTHSVFVNVSKG 
QVAKKEDLISAFGTDDQTEICKQILTKGEVQVSD 

iSJiivrl 1 v^i^il^IYLr i\UjJ\ 1 1 V .AX/ JVC V JNJrJti 1 KJvr 1 1 V 1 

LIEKAMKDIHYSVKTNKSTKQQALEVIKQLKEK 
MKlERAHMRLRFILPVNEGKKLKEKLKPLnCVIES 
EDYGQQLEIVCLIDPGCFREIDELIKKETKGKGSL 
EVLNLKDVEEGDEKFE 


3137 


A 


1 


3143 


MVEGKRHVLHGGRQERMRAKQKGKPLDCSSDL 

VRLIHYHHNSSPLHKQSSGPSSSPAAAAAPEKPG 

PKAAEVGDDFLGDFVVGERVWVNGVKPGWQV 

LGETQFAPGQWAGVVLDDPVGKNDGAVGGVR 

YFECPALQGIFTRPSKLTRQPTAEGSGSDAHSVES 

LTAQNLSLHSGTATPPLTSRVIPLRESVLNSSVKT 

GNESGSNLSDSGSVKRGEKDLRLGDRVLVGGTK 

TGVVRYVGETDFAKGEWCGVELDEPLGKNDGA 

VAGTRYFQCPPKFGLFAPEHKVIRJGFPSTSPAKA 

KKTKRMAMGVSALTHSPSSSSISSVSSVASSVGG 

RPSRSGLLTETSSRYARKISGTTALQEALKEKQQ 

HIEQLLAERDLERAEVAKATSHICEVEKEIALLK 

AQHEQYVAEAEEKLQRARLLVESVRKEKVDLSN 

QLEEERRKVEDLQFRVEEESITKGDLETQTQLEH 

ARIGELEQSLLLEKAQAERLLRELADNRLTTVAE 

KSRVLQLEEELTLRRGEIEELQQCLLHSGPPPPDH 

PDAAEILRLRERLLSASKEHQRESGVLRDKYEKA 

LKAYQAEVDKLRAANEKYAQEVAGLKDKVQQ 

ATSENMGLMDNWKSKLDSLASDHQKSLEDLKA 

TLNSGPGAQQKEIGELKAA/MEGIKMEHQLELGN 

LQAKHDLETAMHVKEKEALREKLQEAQEELAG 

LQRHWRAQLEVQASQHRLELQEAQDQRRDAEL 

RVHELEKLDVEYRGQAQAIEFLKEQISLAEKKML 

DYERLQRAEAQGKQEVESLREKLLVAENRLQAV 

EALCSSQHTHMffiSNDISEETIRTKETVEGLQDKL 

NKRDKEVTALTSQTEMLRAQVSALESKCKSGEK 

KVDALLKEKRRLEAELETVSRKTHDASGQLVLIS 

V^xiLLKisJiKb L M n,LK V LLL, Jb AN Kri or urbKJJLbKb 

VHKAEWRIKEQKLKDDIRGLREKLTGLDKEKSL 

SDQRRYSLIDPSSAPELLRLQHQLMSTEDALRDA 

LDQAQQVEKLMEAMRSCPDKAQHGNSGSANGI 

HQQDKAQKQEDKH 


3138 


A 


110 


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS 

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIIEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF

ASVVDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA
MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 
GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL
KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 

WSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITTVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV
GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 
PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 
QPLKEGVRDMLVKHHLFSWDVDG 


3139 


A 


110


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIIEDLLPASYFSTTLLGVQTDQRV

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASVVDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA

GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 

WSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITTVSQKDEHCWVGELNGLRG

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE

A AftRFVFPriFA VT HXC'WUl T\T2T\fflT\n T 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 
GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 
PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 
QPLKEGVRDMLVKHHLFSWDVDG 


3140 


A 


1 


4939 


SAALGASLAIPRPGLPGVHGRGPGTLSGRAMEG 

AEPRARPERLAEAETRAADGGRLVEVQLSGGAP 

WGFTLKGGREHGEPLVITKIEEGSKAAAVDKLL 

AGDEIVGINDIGLSGFRQEAICLVKGSHKTLKLV 

VKRRSELGWRPHSWHATKFSDSHPELAASPFTST 

SGCPSWSGRHHASSSSHDLSSSWEQTNLQRTLD 

HFSSLGSVDSLDHPSSRLSVAKSNSSIDHLGSHSK 

RDSAYGSFSTSSSTPDHTLSKADTSSAENILYTVG 

LWEAPRQGGRQAQAAGDPQGSEEKLSCFPPRVP 

GDSGKGPRPEYNAEPKLAAPGRSNFGPVWYVPD 

KKKAPSSPPPPPPPLRSDSFAATKSHEKAQGPVFS 

EAAAAQHFTALAQAQPRGDRRPELTDRPWRSAH 

PGSLGKGSGGPGCPQEAHADGSWPPSKDGASSR

LQASLSSSDVRFPQSPHSGRHPPLYSDHSPLCADS 

LGQEPGAASFQNDSPPQVRGLSSCDQKLGSGWQ 

GPRPCVQGDLQAAQLWAGCWPSDTALGALESL 

PPPTVGQSPRHHLPQPEGPPDARETGRCYPLDKG 

AEGCSAGAQEPPRASRAEKASQRLAASITWADG 

ESSRJCPQEIPLLHSLTQEGKRRPESSPEDSATRPP 

PFDAHVGKPTORSDRFATTLROTIQMHRAKLQK 

SRSTVALTAAGEAEDGTGRWRAGLGGGTQEGPL 

AGTYKDHLKEAQARVLRATSFKRRDLDPNPGDL 

YPESLEHRMGDPDTVPHFWEAGLAQPPSSTSGGP 

HPPRIGGRRRFTAEQKLKSYSEPEKMNEVGLTRG

YSPHQHPRTSEDTVGTFADRWKFFEETSKPVPQR 

PAQKQALHGIPRDKPERPRTAGRTCEGTEPWSRT 

TSLGDSLNAHSAAEKAGTSDLPRRLGTFAEYQAS 

WKEQRKPLEARSSGRCHSADDILDVSLDPQERPQ 

HVHGRSRSSPSTDHYKQEASVELRRQAGDPGEP 

REELPSAVRAEEGQSTPRQADAQCREGSPGSQQ 

HPPSQKAPNPPTTSELSHCRGAPELPREGRGRAG 

TLPRDYRYSEESTPADLGPRAQSPGSPLHARGQD 

SWPVSSALLSKRPAPQRPPPPKREPRRYRATDGA 

PADAPVGVLGRPFPTPSPASLDVYVARLSLSHSPS 

VFSSAQPQDTPKATVCERGSQHVSGDASRPLPEA 

LLPPKQQHLRLQTATMETSRSPSPQFAPQKLTDK 

PPLLIQDEDSTRIERVMDNNTTVKMVPIKIVHSES 

QPEKESRQSLACPAEPPALPHGLEICDQIKTLSTSE 

QFYSRFCLYTRQGAEPEAPHRAQPAEPQPLGTQV 

PPEKDRCTSPPGLSYMKAKEKTVEDLKSEELARE 

rVGKDKSLADILDPSVKIKTTMDLMEGIFPKDEH 

LLEEAQQRRKLLPKIPSPRSTEERKEEPSVPAAVS 

LATNSTYYSTSAPKAELLIKMKDLQEQQEHEEDS 

GSDLDHDLSVKKQELIESISRKLQVLREARESLLE 

U V K£ AJN I V JLAj AJb V Jb Al V lSAj V U Kr bbr UKFRMFIG 

DLDKVVNLLLSLSGRLARVENALNNLDDGASPG 

DRQSLLEKQRVLIQQHEDAKELKENLDRRERIVF 

DILANYLSEESUVDYEHFVKMKSALIIEQRELED 

KIHLGEEQLKCLLDSLQPERGK 


3141 


A 


97


1894 


SPRGATMETPPLPPACTKQGHQKPLDSKDDNTE 

KHCPVTVNPWHMKKAFKVMNELRSQNLLCDVT 

IVAEDMEISAHRWLAACSPYFHAMFTGEMSESR 

AKRVRIKEVDGWTLRMLIDYVYTAEIQVTEENV 

QVLLPAAGLLQLQDVKKTCCEFLESQLHPVNCL 

GIRAFADMHACTDLLNKANTYAEQHFADVVLSE 

EFLNLGIEQVCSLISSDKLTISSEEKVFEAVIAWV 

NHDKDVRQEFMARLMEHVRLPLLPREYLVQRV 

EEEALVKNSSACKNYLIEAMKYHLLPTEQRILMK 

SVRTRLRTPMNLPKLMVVVGGQAPKAIRSAECY 

DFKEQRWHQVAELPSRRCRAGMVYLAGLVFAV 

GGFNGSLRVRTWSYDPVKDQWTSVANMRDRR 

STLGAAVLNGLLYAVGGFDGSTGLSSVEAYNIKS 

NEWFHVAPMNTRRSSVGVGWGGLLYAVGGYD 

("1 A CD AVT CT\/TIPV\T A TT\IT \1 /TV T A CTljfPTD D O/" 1 A 

vjAoK'sJ i L»o 1 VxlC Y In A 1 1 JNr-W 1 YlAcMMKKouA 

GVGVLNNLLYAVGGHDGPLVRKSVEVYDPTTN 

A W JtvVi V AUMJN M v^KJtUN Au V OA V IN VjJUL x V V UvjlJ 

DGSCNLASVEYYNPTTDKWTVVSSCMSTGRSYA 
GVTVIDKPL 


3142 


A 


1211 


1311 


FSNLTTEKVAHAKEENLSMHQMLDQTLLELNN 
M 


3143 


A 


1809 


1041 


SEELDREKKLKEDSPRKTPNKESGVPSLPVSLTSI 

DSRGTRVAVSSPMSQHQSYIQYLHAYPYPQMYD 
PSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYGK 
MSGREETEKVNTSPSVNTKTTTESKALDLLQQH 
ANQYRSKSPAPVEKATAEREREAERERDRHSPFG 

QRHLHTHHHTHVGMGYPLIPGQYDPFQGLTSAA 
LVASQQVAAQASASGMFPGQRR 


3144 


A 


78 


604 


SVSGIVLDLLPYLHFLSNMNLDGSAQDPEKREYS 

9 VPVOR FHDnf V <!PP A/IT A WT4HP PWTFVWHP 

YHAMDIRCYHSGGPLHLGDIEDFDGRPCIVCPW 
HKYKITLATGEGLYQSINPKDPSAKPKWCSKGIK 
QRIHTVTVDNGNIYVTLSNEPFKCDSDFYATGDF 
KVTKSSS 


3145 


A 


2 


333 


RNSLLLPPLHLDNSTPAKMSCQQNQQQCQPPPK 
CPSPKCPPKSPVQCLPPASSGCAPSSGGCGPSSEG 
GCFLNHHRRHHRCRRQRPNSCDRGSGQQGGGS 
GCGHGSGGCC 


3146 


A 


3 


1151 


VCTALQEFGTRSTLLRCLDSGFRPGASRGLVGSW 

AAMESTLGAGIV1AEALQNQLAWLENVWLWITF 

LGDPKILFLF YFP A A YY A SRR V G 1A VL Wl SLITE W 

LNLIFKWFLFGDRPFWWVHESGYYSQAPAQVHQ 

FPSSCETGPGSPSGHCMITGAALWPIMTALSSQV 

ATRARSRWVRVMPSLAYCTFLLAVGLSRIFILAH 

FPHQVLAGLITGAVLGWLMTPRVPMERELSFYG 

T TAT AT MT *TT9T TVWTT PTT HT HT ^WQT^T A WW 

CERPEWIHVDSRPFASLSRDSGAALGLGIALHSPC
YAQVRRAQLGNGQKIACLVLAMGLLGPLDWLG 
HPPQISLFYIFNFLKYTLWPCLVLALVPWAVHMF 
SAQEAPPIHSS 


3147


A 


1437 


594


RSFSLSFSLLSPSEMMALGAAGATRVFVAMVAA 
ALGGHPLLGVSATLNSVLNSNAIKNLPPPLGGAA 
GHPGSAVSAAPGILYPGGNKYQTIDNYQPYPCAE 

DEECGTDEYCASPTRGGDAGVQICLACRKRRKR 

c\av> xj a \a rrpnMvr F>jrjTr\/ q QFifYNrupp np tpctt 
v^iviivri/\iviw^rvjiN i v^jviNvJlv^, v ooJJv^lNrirxvVjJLllJtixi 1 1 

TESFGNDHSTLDGYSRRTTLSSKMYHTKGQEGS 

VCLRSSDCASGLCCARHFWSKICKPVLKEGQVC 

TKHRRICGSHGLEIFQRCYCGEGLSCRIQKDHHQ 

ASNSSRLHTCQRH 


3148 


A 


1 


1562 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLH 
TPKLEHLDRVLYEWFLGKRSEGVPVSGPMLIEK
AKDFYEQMQLTEPCWSGGWLWRFKARHGIICK 
LDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 
EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGK 
DRLTVLMCANATGSHRLKPLAIGKCSGPRAFKGI
QHLPVAYKAQGNAWVDKEIFSDWFHHIFVPSVR 
EHFRTIGLPEDSKAVLLLDSSRAHPQEAELVSSN 
VFnFLPASVASLVQPMEQGIRRDFMRNFINPPVP 
LQGPHARYNMNDAIFSVACAWNAVPSHVFRRA 
WRKLWPSVAFAEGSSSEEELEAECFPVKPHNKSF 

AHTT FT VTCFfT^^PPnOT ROT? OA A ^Wf?V A fiP P A P 

GGRPPAATSPAEVVWSSEKTPKADQDGRGDPGE 
GEEVAWEQAAVAFDAVLRFAERQPCFSAQEVG 
QLRALRAVFRSQQQVRRRRGALGAVVKVEALQ 
EGPGGCGATAQSPLPCSSTAGDN 


3149 


A 


132 


4125 


VAVMISTAPLYSGVHNWTSSDRIRMCGINEERRA 

PLSDEESTTGDCQHFGSQEFCVSSSFSKVELTAV 

GSGSNARGADPDGSATEKLGHKSEDKPDDPQPK 

MDYAGNVAEAEGLLVPLSSPGDGLKLPASDSAE 

ASNSRADCSWTPLNTQMSKQVDCSPAGVKALDS 

RQGVGEKNTFILATLGTGWVEGTLPLVTTNFSP 

LPAPICPPAPSSASVPHSVPDAFQAPVPPSAPTLVL 

APVPTPVLAPMPASTPPAAPAPPSVPMPTPTPSSG 

PPSTPTLIPAFAPTPVPAPTPAPIFTPAPTPMPAATP 

AACPTSAPIPASFSLSRVCFPAAQAPAMQKVPLSF 

QPGTVLTPSQPLVYIPPPSCGQPLSVATLPTTLGV 

SSTLTLPVLPSYLQDRCLPGVLASPELRSYPYAFS 

VARPLTSDSKLVSLEVNRLPCTSPSGSTTTQPAPD

GVPGPLADTSLVTASAKVLPTPQPLLPAPSGSSAP 

PHPAKMPSGTEQQTEGTSVTFSPLKSPPQLEREM 

ASPPECSEMPLDLSSKSNRQKLPLPNQRKTPPMP 

VLTPVHTSSKALLSTVLSRSQRTTQAAGGNVTSC 

LGSTSSPFVIFPEIVRNGDPSTWKNSTALISTIPG 

TYVGVANPVPASLLLNKDPNLGLNRDPRHLPKQ 

EPISIIDQGEPKGTGATCGKKGSQAGAEGQPSTV 

KRYTPARIAPGLPGCQTKELSLWKPTGPANIYPR 

CSVNGKPTSTQVLPVGWSPYHQASLLSIGISSAG 

QLTPSQGAPIRPTSWSEFSGVPSLSSSEAVHGLP 

EGQPRPGGSFVPEQDPVTKNKTCRIAAKPYEEQV 

NPVLLTLSPQTGTLALSVQPSGGDIRMNQGPEES 

ESHLCSDSTPKMEGPQGACGLKLAGDTKPKNQV 

LATYMSHELVLATPQNLPKMPELPLLPHDSHPKE 

LILDVVPSSRRGSSTERPQLGSQVDLGRVKMEKV 

DGDWFNLATCFRADGLPVAPQRGQAEVRAKA 

GQARVKQESVGVFACKNKWQPDDVTESLPPKK 

MKCGKEKDSEEQQLQPQAKAWRSSHRPKCRK 

LPSDPQESTKKSPRGASDSGBCEHNGVRGKHKHR 

KPTI<TESQSPGKRADSHEEGSLEKKAKSSFRDF1P 

VVLSTRTRSQSDLKARKQKTSSSQSLEHRLRNRN 

LLLPNKVQGISDSPNGFLPNNLEEPACLENSEKPS 

OJVKJvv^jv 1 JSJrllVl/i 1 V oilJ^AlVOlVOlv Wo^^JV I KoJTJV 

SPTPVKPTEPCTPSKSRSASSEEASESPTARQIPPE 
ARRLIVNKNAGETLLQRAARLGYKDVVLYCLQK 
DSEDVNHRDNAGYTALHEACSRGWTDILNILLE 
HGA 


3150 


A


3 


2795 


SLRMHNLSILVRQIKFYYQETLQQLIMMSLPNVLI 

IGKOTFSEQGTEEVKKLLLLLLGCAVQCQICKEEF 

IERIQGLDFDTKAAVAAHIQEVTHNQENVFDLQ 

WMEVTDMSQEDIEPLLKNMALHLKRLIDERDEH 

SETIEELSEERDGLHFLPHASSSAQSPCGSPGMKR 

TESRQHLSVELADAKABORRLRQELEEKTEQLLD 

CKQELEQMEIELKRLQQENMNLLSDARSARMYR 

DELDALREKAVRVDKLESEVSRYKERLHDEEFY 

KARVEELKBDNQXHLLETKTMLEDQLEGTRARSD 

KLHELEK£NLQLKAKLHDMEMERDMDRKXIEE 

LMEENMTLEMAQKQSMDESLHLGWELEQISRTS 

ELSEAPQKSLGHEVNELTSSRLLKLEMENQSLTK 

TVEELRTTVDSVEGNASKILKMEKENQRLSKKV

EILENEIVQEKQSLQNCQNLSKDLMBCEKAQLEKT 

IETLRENSERQIKJLEQENEHLNQTVSSLRQRSQIS 

AEARVKDIEKENKJLHESIKETSSKLSKIEFEKRQI 

KXELErTyXEKGERAEELENELHHT FKFNFT T OK 

KITNLKITCEKIEALEQENSELERENRKLKKTLDS 

FKNLTFQLESLEKENSQLDEENLELRRNVESLKC 

ASMKMAQLQLENKELESEICEQLKKGLELLKASF 

KKTERLEVSYQGLDIENQRLQKTLENSNKKIQQL 

ESELQDLEMENQTLQKNLEELKISSKRLEQLEKE 
NKSLEQETSQLEKDKKQLEKENKRLRQQAEIKD 

1 1 JLUJtilNlN ViVlOINLJDJMDlNJtv 1 J^olsJllOl I JMboU VKL/I1 

ELEKENKELVKRATEDIKTLVTLREDLVSEKLKT
QQMNNDLEKLTHELEKIGLNKERLLHDEQSTDD 
SRYKLLESKLESTLKKSLEIKEEKIAALEARLEES 
TNYNQQLRQELKTVKKK 


3151 


A


2 


2515 


GFWLHLTLLGASLPAALGWMDPGTSRGPDVGV
GESQAEEPRSFEVTRREGLSSHNELLASCGKICFC 
SRGSRCVLSRKTGEPECQCLEACRPSYVPVCGSD 
GRFYENHCKLHRAACLLGKR1TVIHSKDCFLKGD 
TCTMAGYARLKNVLLALQTRLQPLQEGDSRQDP 
ASQKRLLVESLFRDLDADGNGHLSSSELAQHVL 
KKQDLDEDLLGCSPGDLLRFDDYNSDSSLTLREF 
YMAFQWQLSI^PEDRVSVTWTVGLSTVLTCA 
VHGDLRPPIIWKRNGLTLNFLDLEDINDFGEDDS 
LYITKVTTIHMGNYTCHASGHEQLFQTHVLQVN 
VPPVIRVYPESQAQEPGVAASLRCHAEGIPMPRIT 
WLKNGVDVSTQMSKQLSLLANGSELHISSVRYE 
DTGAYTCIAKNEVGVDEDISSLFIEDSARKTLANI 
LWREEGLSVGNMFYVFSDDGIIVIHPVDCEIQRH 
LKPTEKIFMSYEEICPQREKNATQPCQWVSAVNV 
RNRYIYVAQPALSRVLVVDIQAHKVLQSIGVDPL 
PAKLSYDKSHDQVWVLSWGDVHKSRPSLQVITE 
ASTGQSQHLIRTPFAGVDDFFIPPTOLIINHIRFGFI 
FNKSDPAVHKVDLETMMPLKTIGLHHHGCVPQA 
MAHTHLGGYFFIQCRQDSPASAARQLLVDSVTD

oVJUjrrlNOJJ V lLrl rrl 1 orLiOKrl VoAAAUor WL.rlV 

QEITVRGEIQTLYDLQINSGISDLAFQRSFTESNQ 
YNIYAALHTEPDLLFLELSTGKVGMLKNLKEPPA 
GPAQPWGGTHRIMRDSGLFGQYLLTPARESLFLI 
NGRQNTLRCEVSGIKGGTTVVWVGEV 


3152 


A 


1 


2645 


GAGWQVSLTGRWSPGREAGAGEVRQDPGSTAA 
SPSSCDADLSARMARGERRRRAVPAEGVRTAER 
AARGGPGRRDGRGGGPRSTAGGVALAVWLSL 
ALGMSGRWVLAWYRARRAVTLHSAPAVLPADS 
SSPAVAPDLFWGTYRPHVYFGMKTRSPKPLLTG 
LMWAQQGTTPGTPKLRHTCEQGDGVGPYGWEF 
HDGLSFGRQHIQDGALRLTTEFVKRPGGQHGGD 
WSWRVTVEPQDSGTSALPLVSLFFYWTDGKEV 
LLPEVGAKGQLKFISGHTSELGDFRFTLLPPTSPG
DTAPKYGSYNVFWTSNPGLPLLTEMVKSRLNSW 
FQHRPPGASPERYLGLPGSLKWEDRGPSGQGQG 
QFLIQQVTLKIPISIEFVFESGSAQAGGNQALPRLA 
GSLLTQALESHAEGFRERFEKTFQLKEKGLSSGE 
QVLGQAALSGLLGGIGYFYGQGLVLPDIGVEGSE
QKVDPALFPPVPLFTAVPSRSFFPRGFLWDEGFH 
QLVVQRWDPSLTREALGHWLGLLNADGWIGRE 
QILGDEARARVPPEFLVQRAVHANPPTLLLPVAH 
MLEVGDPDDLAFLRKALPRLHAWFSWLHQSQA 

ASHPSVTERHLDLRCWVALGARVLTRLAEHLGE 
AEVAAELGPLAASLEAAESLDELHWAPELGVFA 
DFGNHTKAVQLIsTRPPQGLVRVVGRPQPQLQYV 
DALGYVSLFPLLLRLLDPTSSRLGPLLD1LADSRH 

LWSPFGLRSLAASSSFYGQRNSEHDPPYWRGAV 
WLNVNYLALGALHHYGHLEGPHQAEAAKLHGE 
LRANWGNVWRQYQATGFLWEQYSDRDGRGM 
GCRPFHGWTSLVLLAMAEDY 


3153 


A 


1 


4312 


MVIKTDELPAAAPADSAREHGSQAGGKGRPGAA 
AVLLADLERDARQGECALPGAAMAGLAPLKPE 
ASRSSSPGPTGCIRARVAAEAGTRNPGNAGAELE 
SWLPCCHGHPETPEPRGGQLPTAPELPSVMLLNG 
DCPESLKICEAAAAEPPRENGLDEAGPGDETTGQ
EVIVIQDTGFSVKILAPGIEPFSLQVSPQEMVQEIH 
QVLMDREDTCHRTCFSLHLDGNVLDHFSELRSV 
EGLQEGSVLRVVEEPYTVREARIHVRHVRDLLKS 
LDPSDAFNGVDCNSLSFLSVFTDGDLGDSGKRK 
KGLEMDPIDCTPPEYILPGSRERPLCPLQPQNRD 
WKPLQCLKVLTMSGWNPPPGNRKMHGDLMYLF 
VITAEDRQVSITASTRGFYLNQSTAYHFNPKPASP 
RFLSHSLVELLNQISPTFKKNFAVLQKKRVQRHP
FERIATPFQVYSWTAPQAEHAMDCVRAEDAYTS 
RLGYEEHIPGQTRDWNEELQTTRELPRKNLPERL 
LRERAIFKVHSDFTAAATRGAMAVIDGNVMAIN 
PSEETKMQMFIWNNIFFSLGFDVRDHYKDFGGD 
VAAYVAPTNDLNGVRTYNAVDVEGLYTLGTVV 
VDYRGYRVTAQSIIPGILERDQEQSVIYGSIDFGK 
TVVSHPRYLELLERTSRPLKILRHQVLNDRDEEV 
ELCSSVECKGIIGNDGRHY1LDLLRTFPPDLNFLP 
VPGEELPEECARAGFPRAHRHKLCCLRQELVDA 
FVEHRYLLFMKLAALQLMQQNASQLETPSSLEN 
GGPSSLESKSEDPPGQEAGSEEEGSSASGLAKVK 
ELAETIAADDGTDPRSREVIRNACKAVGSISSTAF 
DIRFNPDIFSPGVRFPESCQDEVRDQKQLLKDAA 
AFLLSCQIPGLVKDCMEHAVLPVDGATLAEVMR 
QRGINMRYLGKVLELVLRSPARHQLDHVFKIGIG 
ELITRSAKKFKTYLQGVELSGLSAAISHFLNCFLS 
SYPNPVAHLPADELVSKKRNKRRXNRPPGAADN 
TAWAVMTPQELWKNICQEAKNYFDFDLECETV 
DQAVETYGLQKITLLREISLKTGIQVLLKEYSFDS 
mCPAFTEEDVLNIFPVVKHVNPKASDAFHFFQS 
GQAKVQQGFLKEGCELINEALNLFNNVYGAMH 
VETCACLRLLARLHYIMGDYAEALSNQQKAVL 
MSERVMGTEHPNTIQEYMHLALYCFASSQLSTA 
LSLLYRARYLMLLVFGEDHPEMALLDNNIGLVL 
HGVMEYDLSLRFLENALAVSTKYHGPKALKVAL 

ijix±x±-i v nj\ v l JiiJivrvjDr Ivo r\ i ri r, is. ha i x III JV 1 UL 

GEDHEKTKESSEYLKCLTQQAVALQRTMNEIYR 
NGSSANlPPIJm'APSMASVLEQLNVINGILFIPLS 
QKDLENLKAEVARRHQLQEASRNRDRAEEPMA 
TEPAPAGAPGDLGSQPPAAKDPSPSVQG 


3154 


A 


416 


4082 


KFKLIKIMLLTLIILLPVVSKFSFVSLSAPQHWSCP 

EGTLAGNGNSTCVGPAPFLIFSHGNSIFRIDTEGT 

NYEQLVVDAGVSVIMDFHYNEKRIYWVDLERQ

LLQRWLNGSRQERVCNIEKNVSGMAINWINEEV 

IWSNQQEGIITVTDMKGNNSHILLSALKYPANVA 

VDPVERFIFWSSEVAGSLYRADLDGVGVKALLE 

TSEKITAVSLDVLDKRLFWIQYNREGSNSLICSCD 

YDGGSVHISKHPTQHNLFAMSLFGDRIFYSTWK 

MKTIWIANKHTGKDMVRDsT^HSSFVPLGELKVV 

HPLAQPKAEDDTWEPEQKLCKLRKGNCSSTVCG 

QDLQSHLCMCAEGYALSRDRKYGEGNDWKYCE 

DVNECAFWNHGGTLGCKNTPGSYYCTCPVGFVL 

LPDGKRCHQLVSCPRNVSECSHDC\^TSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPLSP 

VSWECDCFPGYDLQLDEKSCAASGPQPFLLFANS 

QDIRHMHFDGTDYGTLLSQQMGMVYALDHDPV 

ENKJYF A HT ALK WIERANMD G SQRERLIEEG VD 

VPEGLAVDWIGRRFYWTDRGKSLIGRSDLNGKR 

SKIITIENISQPRGIAVHPMAKRLFWTDTGIKPRIE 

SSSLQGLGRLVIASSDLIWPSGITIDFLTDKLYWC 

DAKQSVIEMANLDGSKRRRLTQNDVGHPFAVA 

VFEDYVWFSDWAMPSVIRVNKRTGKDRVRLQG 

SMLKPSSLVVVHPLAKPGADPCLYQNGGCEHIC

KKRLGTAWCSCREGFMKASDGKTCLALDGHQL 

LAGGEVDLKNQVTPLDILSKTRVSEDNTTESQHM 

LVAEIMVSDQDDCAPVGCSMYARCISEGEDATC 

QCLKGFAGDGKLCSDIDECEMGVPVCPPASSKCI 

NTEGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTNTEGGYTCMCAGRLSEPGLICPDSTP 

PPHLREDDHHYSVRNSDSECPLSHDGYCLPIDGV 

CMYIEALDKYACNCVVGYIGERCQYRDLKWWE 

LRHAGHGQQQKVIWAVCVWLVMLLLLSLWG 

AHYYRTOKI T ^KMPK>JPVFF9^rcnVP<sR'RPATYr 

EDGMSSCPQPWFVVIKEHQDLKNGGQPVAGED 
GQAADGSMQPTSWRQEPQLCGMGTEQGCWIPV 
SSDKGSCPQVMERSFHMPSYGTQTLEGGVEKPH 
SLLSANPLWOORALDPPHOMELTO 


3155 


A 


533 


212 


GTSGWYWERLAERRGRLWSREEAMATMENKVI 
CALVLVSMLALGTLAEAQTETCTVAPRERQNCG
FPGVTPSQCANKGCCFDDTVRGVPWCTYPNTID 
VPPEEECEF 


3156 


A 


2 


1585 


PRVRAADVAAGAQAWSAGMAKSNGENGPRAP 

AAGESLSGTRESLAQGPDAATTDELSSLGSDSEA 

NGF AERRIDKFGFIVGSQG AEGALEEVPLEVLRQ 

RESKWLDN1LNNWDKWMAKKHKKIRLRCQKGI 

PPSLRGRAWQYLSGGKVKLQQNPGKFDELDMSP 

GDPKWLDVIERDLHRQl^FHEMFVSRGGHGQQD 

LFRVLKAYTLYRPEEGYCQAQAPIAAVLLMHMP 

AEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCA 

FSRTLPWSSVLRVWDMFFCEGVKIIFRVGLVLLK 

HALGSPEKVKACQGQYETEERLRSLSPKIMQEAF 

LVOEVVELPVTFROTFRFT4T T OT R"R WOFTR HFT O 

CRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLD 
APLPGSKAKPKPPKQAQKEQRKQMKGRGQLEB3> 
PAPNQAMVVAAAGDACPPQHVPPKDSAPKDSAP 
QDLAPQVSAHHRSQESLTSQESEDTYL 


3157 


A 


3 


601 


SSAMGSRSSHAAVIPDGDSIRRETGFSQASLLRLH 

rOlFRALDRNKKGYLSR]V^^ 

IIESFFPDGSQRVDFPGFVRVLAHFRPVEDEDTET 

QDPKKPEPLNSRRI^KLHYAFQLYDLDRDGKISR 

ffiMLQVLI^lvrVGVQVTEEQLENflADRTVQEAD 

EDGDGAVSFYE1TXSLEKMDVEHKMSIRILK 


3158 


A 


2 


409 


ISSCPHTAYEGSMSTLSNFTQTLEDWRRIFITYM 
DNWRQNTTAEQEALQAKVDAENFYYVILYLMV 
MIGMFSFIIVAILVSTVKSKRREHSNDPYHQYIVE 
DWQEKYKSQILNLEESKATIHENIGAAGFKMSP 


3159 


A 


3 


416 


P WG AAELDMGRRD AOLL AALL VLGLC A LA G SF 

KPSPCQCSRLSPHNRTNCGFPGITSDQCFDNGCCF 

DSSVTGVPWCFHPLPKQESDQCVMEVSDRRNCG 

YPGISPEECASRKCCFSNFIFEVPWCFFPKSVEDC 

HY 


3160 


A 


179 


409 


KPKTBLILKMWYPELFVWVSQEPFPNKDMEGRL 
PKGRLPVPKEVNRKKNDETNAASLTPLGSSELRS 
PRISYLHFF 


3161 


A 


683 


1186


LSSTGGLHAAACAAAMSLVIPEKFQHILRVLNTN 

IDGRJOQAFAITAIKGVGRRYAHVVT .RKADTDT T 

KRAGELTEDEVERVITIMQNPRQYKIPDWFLNRQ 

KDVKDGKYSQVLANGLDNKLREDLERLKKIRA 

HRGLRHFWGLRVRGQHTKTTGRRGRTVGVSKK 

K 


3162 


A 


1 


1938 


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWRVP 

GRLLLLLLPALCCLPGAARAAAAAAGAGNRAA 

VAVAVARADEAEAPFAGQNWLKSYGYLLPYDS 

RASALHSAKALQSAVSTMQQFYGIPVTGVLDQT

TIEWMKKPRCGVPDHPHLSRRRRNKRYALTGQK 

WRQKHITYSIHNYTPKVGELDTRKAIRQAFDVW 

QKVTPLTFEEVPYHEIKSDRKEADIMIFFASGFHG 

DSSPFDGEGGFLAHAYFPGPGIGGDTHFDSDEPW 

TLGNANHDGNDLFLVAVHELGHALGLEHSSDPS

AIMAPFYQYMETHNFKLPQDDLQGIQKIYGPPAE 

PLEPTRPLPTLPVRRIHSPSERKHERQPRPPRPPLG 

DRPSTPGTKPNICDGNFNTVALFRGEMFVFKDR 

WFWRLRNNRVQEGYPMQIEQFWKGLPARIDAA 

YERADGRFVFFKGDKYWVFKEVTVEPGYPHSLG 

ELGSCLPREGIDTALRWEPVGKTYFFKGERYWR 

YSEERRATDPG YPKPITVWKGIPO APOG AFISKE 

GYYTYTYKGRDYWKFDNQKLSVEPGYPRNILRD 

WMGCNQKEVERRKERRLPQDDVDIMVTINDVP 

GSVNAVAVVIPCILSLCILVLVYTIFQFKNKTGPQ 

PVTYYKRPVQEWV 


3163 


A 


1235 


2223


SRLSLQFYVSFRRTGLFTCKLIVEIFFR>rmNDSL 

RTNT^VRFQPEmCACIYLAARALQIPLPTRPHW 

FLLFGTTEEEIQEICIETLRLYTRKKPNYELLEKEV 

EKRKVALQEAKLKAKGLNPDGTPALSTLGGFSP 

ASKPSSPREVKAEEKSPISESTV^TVKKEPEDRQQA 

SKSPYNGVRKDSKRSRNSRSASRSRSRTRSRSRS 

HTPRRHYNNRRSRSGTYSSRSRSRSRSHSESPRR 

HHNHGSPHLKAKHTRD^ 

RSQSKSl^HSDAAKXIiRHERGHHRDRRERSRSF 
ERSHKSKHHGGSRSGHGRHRR 


3164 


A 


3 


3274 


DCRLQAAMPTNFTVVPVEAHADGGGDETAERT 

EAPGTPEGPEPERPSPGDGNPRENSPFLNNVEVE 

QESFFEGK>MALFEEEMDSNPMVSSLLNK1ANY 

TNLSQGWEHEEDEESRRREAKAPRMGTFIGVY 

LPCLQNILGVDLFLRLTWIVGVAGVLESFLIVAMC 

CTCTMLTAISMSAIATNGWPAGGSYYMISRSLG 

PEFGGAVGLCFYLGTTFAGAMYILGTIEIFLTYISP 

GAA1FQAEAAGGEAAAMLHNMRVYGTCTLVLM 

ALVVFVGVKYVNKLALVFLACVVLSILAIYAGVI 

KSAFDPPDIPVCLLGNRTLSRRSFDACVKAYGIH 

NNSATSALWGLFCNGSQPSAACDEYFIQNNVTEI 

QGIPGAASGVFLENLWSTYAHAGAFVEKKGVPS 

VPVAEESRASTLPYVLTDIAASFTLLVGIYFPSVT 

GMAGSNRSGDLKDAQKSPTGTILAIVTTSFIYLS 

CIVLFGAGIEGVVLRDKFGEALQGNLVIGMLAW 

PSPWVIVIGSFFSTCGAGLQTLTGAPRLLQAIARD 

GIVPFLQVFGHGKANGEPTWALLLTVLICETGILI 

ASLDSVAPELSMFFLMCYLFVNLACAVQTLLRTP 

NWRPRFKFYHWTLSFLGMSLCLALMFICSWYYA 

LSAMLIAGCIYKYIEYRGAEKEWGDGIRGLSLNA 

ARYALLRVEHGPPHTKNWRPQVLVMLNLDAEQ 

AMKHPRLLSFTSQLKAGKGLT1VGSVLEGTYLD 

KHMEAQRAEENIRSLMSTEKTKGFCQLVVSSSLR 

DGMSHLIQSAGLGGLKHNTVLMAWPASWKQED 

NPFSWKNFVDTVRDTTAAHQALLVAKNVDSFPQ 

NQERFGGGHmVWWIVHDGGMLMLLPFLLRQH 

KVWRKCRMRIFTVAQVDDNSIQMKKDLQMFLY 

HLRJSAEVEVVEMVENDISAFTYERTLMMEQRS 

OMT KOMOT ^KTsTFOFRFAOT TTTm?MTA QHTA A A 

ARTQAPPTPDKVQMTWTREKLIAEKYRSRDTSL 
SGFKDLFSMKPDQSNVRRMHTAVKLNGVVLNK 
SQDAQLVLLNMPGPPKNRQGDENYMEFLEVLTE 
GLNRVLLVRGGGREVITIYS 


3165 


A 


3 


2681 


GRGARGGSGAGALRGCRGYLQKLSGKGPSRGY 

RSRWFVFDARRCYLYYFKSPQDALPLGHLDIAD 

ACFSYQGPDEAAEPGTEPPAHFQVHSAGAVTVL 

KAPNRQLMTYWLQELQQKRWEYCNSLDMVKW 

DSRTSPTPGDFPKGLVARDNTDLIYPHPNASAEK 

AR3WLAVETWGELVGEQAANQPAPGHPNSINF 

YSLKQWGNELKNSMSSFRPGRGHNDSRRTVFYT 

NEEWELLDPTPKDLEESIVQEEKKKLTPEGNKGV 

TGSGFPFDFGRNPYKGKRPLKJDIIGSYKNRHSSG 

DPSSEGTSGSGSVSIRKPASEMQLQVQSQQEELE 

QLKKDLSSQBCELVRLLQQTVRSSQYDKYFTSSRL 

CEGVPKDTLELLHQKDDQILGLTSQLERFSLEKE 

SLQQEVRTLKSKVGELNEQLGMLMETIQAKDEV 

IIKLSEGEGNGPPPTVAPS SPS V VPV ARDQLELDR 

LKDNLQGYKTQNKFLNKEILELSALRRNPERRER 

DLMARNSSLEAKLCQIESKYLELLQEMKTPVCSE 

DQGPTREVIAQLLEDALQVESQEQPEQAFVKPHL 

VSEYDIYGFRTVPEDDEEEKLVAKVRALDLKTL 

YLTENQEVSTGVKWENYFASTVNREMMCSPEL 

KNLIRAGIPHEHRSKV WK WC VDRHTRKFKDNTE 

PGOTQTLLQKALEKQNPASKQIELDLLRTLPNNK 

HYSCPTSEGIQKLRNVLLAFSWRNPDIGYCQGLN 

T>T VAVAT T YT FOFDAFWCT VTTVF VFMPR DYVT 
KTLLGSQVDQRVFRDLMSEKLPRLHGHFEQYKV 
DYTLITFNWFLWFVDSVVSDILFKIWDSFLYEGP 
KVIFRFALALFKYKEEEILKLQDSMSIFKYLRYFT 
RTELDARSGTDAPTTWRKSGWS 


3166 


A 


10 


4070 


FPGPTISSNSQLYRASALFETIRHEAQLSTDYKLS 
LFDLQTSSYQALQRVLVSLGHHDEALAVAERGR 

trafadllverqtgqqdsdpyspvitdqilemvn 
gqrglvlyyslaagylyswllapgagivkfheh 
ylgentvenssdfqasssvtlptatgsaleqhias 
vrealgveshysracasseteseagdimdqqfee 
mnnklnsvtdptgflrmvrrl^ 
lfsntvsptqdgtsslprrqssfakpplralydll 
iapmegglmhssgpvgrhrqlilvlegelylipf 
allkgsssneylyerfgllavpsirslsvqskshl 
rknpptyssstsmaavignpklpsavmdrwlwg 
pmpsaeeeaymvsellgcqplvgsvatkervms 
altqaecvhfatbiswklsalvltpsmdgnpass 
kssfghpyhpeslrvqddasdgesisdcpplqel 
lltaadvldlqlpvklwlgssqesnskvaadg 
vialtrafl aag aqc vlv sl wpvp v a afkmfih 
afyssllnglkasaalgeAmkvvqsskafshps 
nwagfmligsdvklnspssligqalteilqhper 
ardalrvllhlvekslqriqngqrnamytsqqs 
venkvggipgwqalltavgfrldpptsglpaav 
ffptsdpgdrlqqcsstlqsllglpnpalqalck 
litasetgeqlisravknmvgmlhqvlvqlqag 
ekeqdlasapiqvsisvqlwrlpgcheflaalgf 
vlcevgqeevilktgkqanrrtvhfalqsllslf 
dstelpbcrlsldssssleslasaqsvsnalplgyq 
qppfsptgadsiasdaisvyslssiassmsfvskpe 
ggsegggpggrqdhdrsknaylqrstlprsqlp 
pqtrpagnkdeeeyegfsiisneplatyqenrntc
fspdhkqpqpgtaggmrvsvsskgsistpnspvk 
mtlipspnspfqkvgklassdtgesdqsstetdst 
vksqeesnpkldpqelaqkileetqshliaverlq 
rsggqvsksnnpedgvqapsstavfrasetsafs 
rpvlshqksqpspvtvkpkpparssslpkvssgys 
spttsemsekdspsqhsgrpspgcdsqtsqldqpl
fklkypsspysahisicspknmspssghqspagsap 

CD AT CVCC A f"IC AT? CCD A "P» A T>"P»TTMi r T VKvf A ATTlCt/l/ 

qavhnlkmfwqstpqhstgpmkifrgapgtmts 
krdvlsllnlsprpnkkeegvdklelkelslqqh 
dgappkappnghwrtettslgslplpagppatap 
arplrlpsgngykflspgrffpsskc 


3167 


A 


1 


762


aarrrqkgkeenmmmdlfetgsyffyldgenv 
tlqplevaegsplypgsdgtlspcqdqmppeags 
dssgeehvlappglqpphcpgqcliwacktckrk 

C A DTPiP P V A A TT T> T?"D r> r> r V'VTVTC A rt? A T VT5 r> T\ T K 

oAr 1 IJivrUVAA 1 JLKJdJKJKJKJ^JsJsJIN JbAr bALKKK 1 V A 

NPNQRLPKVEILRSAISYIERLQDLLHRLDQQEK 
MQELGVDPFSYRPKQENLEGADFLRTCSSQWPS 
VSDHSRGLVITAKEGGASIDSSASSSLRCLSSrVDS 
ISSEERKLPCVEEWEK 


3168


A 


/\J 1 




loKtvV IMisxJNrr V 1 oJJitoJ^KisJvjlr^AroH 

KIMSSPLSKELRQKYNVRSMPIRKDDEVQVVRG 

HYKGQQIGKWQVYRKKYVIYIERVQREKaNGT 

TVHVGIHPSKVVITRLKLDKDRKKILERKAKSRQ 

VGKEKGKYKEELIEKMOE 


3169 


A 


156 


3168


GPGGAISLSVEAKAGADLLVKGKQARMDIYDTQ 
TLGVVVFGGFMVVSAIGIFLVSTFSMKETSYEEA 
LANQRKE^KTrfflQKVEKKKKEKTVEKKGKT 
KKKEEKPNGKIPDHDPAPNVTVLLREPVRAPAV 

AVAPTPVQPPIIVAPVATVPAMPQEKLASSPKDK 

KKKEBCKVAKVEPAVSSWNSIQVLTSKAAILETA 

PKEGRNTOVAQSPEAPKQEAPAKKKSGSKKKGP 

PDADGPLYLPYKTLVSTVGSMVFNEGEAQRLIEI 

LSEKAGnQDTWHKATQKGDPVADLKRQLEEKEK 

LLATEQEDAAVAKSKLRELNKEMAAEKAKAAA 

GEAKVKKQLVAREQEITAVQARMQASYREHVK 

EVQQLQGKIRTLQEQLENGPNTQLARLQQENSIL 

RDALNQATSQVESKQNAELAKLRQELSKVSKEL 

VEKSEA VRQDEQQRKALEAKAAAFEKQ VLQLQ 

ASHRESEEALQKRLDEVSRELCHTQSSHASLRAD 

AEKAQEQQQQMAELHSKLQSSEAEVRSKCEELS 

GLHGQLQEARAENSQLTERIRSIEALLEAGQARD 

AQDVQASQAEADQQQTRLKELESQVSGLEKEAI 

ELREAVEQQKVKNNDLREKNWKAMEALATAEQ 

ACKEKLHSLTQAKEESEKQLCLIEAQTMEALLAL 

LPELSVLAQQNYTEWLQDLKEKGPTLLKHPPAP 

AEPSSDLASKLREAEETQSTLQAECDQYRSILAET 

EGMLRDLQKSVEEEEQVWRAKVGAAEEELQKS 

RVTVKHLEEIVEKLKGELESSDQVREHTSHLEAE 

LEKHMAAASAECQNYAKEVAGLRQLLLESQSQL 

DAAKSEAQKQSDELALVRQQLSEMKSHVEDGDI 

AUAr Aoor tArr AJbV^JJr V l^Liv 1 v^Ub W 1 HA II .hi Jh 

QTQRQKLTAEFEEAQTSACRLQEELEKLRTAGPL 
ESSETEEASQLKERLEKEKKLTSDLGRAATRLQE 
LLKTTQEQLAREKDTVKKLQEQLEKAEDGSSSK 
EGTSV 


3170 


A 


6730 


4027 


THASEKYSYGHLPTHS1TAHPMVTIRISDRQRLIQ 

PYIHNYSWLLFAALALYSAHLASAEDVDGEKLD 

PQTRSSATTLRSQCMQLVGDCLMKAHQGKGLK 

ALALLGVLPDGDSSLEDHALPVTVPTGASEEQLE 

KKAVQGAELSEAGNGKRAVHEEIRPVDFKQRNK 

ADKGVSLSKJDPSCQTQISDSPADASPPTGLPDAE 

DSEVSSQKPIEEKAVTPSPEQVFAECSQKR1LGLL 

AAMLPPLKSGPTVPLIDLEHVLPLMFQVVISNAG 

HLNETYHLTLGLLGQLIIRLLPAEVDAAV1KVLSA 

KHNLFAAGD SSIVPDG WICTTHLLFSLGA VCLD S 

RVGLDWACSMAEELRSLNSAPLWRDVIATFTDH 

CIKQLPFQLKHTNIFTLLVLVGFPQVLCVGTRCV 

YMDNANEPHNVIILKHFTEKNRAVIVDVKTRKR 

KTVKDYQLVQKGGGQECGDSRAQLSQYSQHFA 

FIASHLLQSSMDSHCPEAVEATWVLSLALKGLY 

KTLKAHGFEEIRATFLQTDLLKLLVKXCSKGTGF 

SKTWLLRDLEILSMLYSSKKEINALAEHGDLEL 

DERGDREEEVERPVSSPGDPEQKKLDPLEGLDEP 

TRJCFLMAHDALNAPLHILRAIYELQMKKTDYFF 

LEVQKRFDGDELTTDERJRSLAQRWQPSKSLRLE 

EQSAKAVDTDMIILPCLSRPARCDQATAESNPVT 

QKLISSTESELQQSYAKQRRSKSAALLHKELNCK 

QVD A \n? nVT TTP\/"KTD A T A *\7T V A Dinn a ct t Anim 
olvKA V t\U I L»r K V IN fcA 1 A V JL Y AKri VLAoLLAhWP 

SHVPVSED]LELSGPAHMTYILDMFMQLEEKHE 
WEKWMQTELVLTHQVLPLPHRLPPVSASWSEA 
TCVAVQLPDRCECSKGRVTVSSPKDWASEELRG 
PERDFQLNQKALSPSSQFPSAEILRHIR 


3171 


A 


557 


89 


GTRAGPVKDREAFQRLNFLYQAAHCVLAQDPEN 

QALARFY CYTERT1AKRLVLRRDPSVKRTLCRGC 
SSLLVPGLTCTQRQRRCRGQRWTVQTCLTCQRS 
QRFLNDPGHLLWGDRPEAQLGSQADSKJPLQPLP 
NTAHSISDRLPEEKMQTQGSSNQ 


3172 


A 


2 


496 


FRRAfiAGRrrRRRGFVTSPT 9PFPT A FORT AT<IRR 

PEPQTTQTVRSSALPAPPASPMSQYAPSPDFKRA 

LDSSPEANTEDDKTEEDVPMPKNYLWLTIVSCFC 

PAYPINIVALVFSEMSLNSYNDGDYEGARRLGRN 

AKWVAIASIIIGLLnGISCAVHFTRNA 


3173 


A 


2 


4048 


FRSGGCRRRAWTSRWPQRRRSPESCEAPLSAPL 

WGPQRGLPGREPLRSRSASAIALRTIGHILALLLR 

LLHLGLGSGGCREDVPPSGRGKKEEKMKKHRRA 

LALVSCLFLCSLVWLPSWRVCCKESSSASASSYY 

SQDDNCALENEDVQFQKKDEREGPINAESLGKS 

GSNLPISPKEHKLKDDSIVDVQNTESKKLSPPWE 

TLPTVDLHEESSNAVVDSETVENISSSSTSEITPIS 

KLDEIEKSGTIPIAKPSETEQSETDCDVGEALDAS 

APIEQPSFVSPPDSLVGQHIENVSSSHGKGKITKSE 

FESKVSASEQGGGDPKSALNASDNLKNESSDYT 

KPGDIDPTSVASPKDPEDIPTFDEWKKKVMEVEK 

EKSQSMHASSNGGSHATKKVQKNRNNYASVEC 

GAJJLAANPEAKSTSAILIEhMDLYMLNPCSTKI 

WFVIELCEPIQVKQLD1ANYELFSSTPKDFLVSISD 

RYPTNKWIKLGTFHGRDERNVQSFPLDEQMYAK 

YVKMFIKYIKVELLSHFGSEHFCPLSLIRVFGTSM 

VEEYEEIADSQYHSERQELFDEDYDYPLDYNTGE 

DKSSKNLLGSATNAILNMVNIAANILGAKTEDLT 

EGNKSISENATATAAPKMPESTPVSTPVPSPEYVT 

TEVHTHDMEPSTPDTPKESPIVQLVQEEEEEASPS

TVTLLGSGEQEDESSPWFESETQIFCSELrnCCIS 

SFSEY1YKWCSVRVALYRQRSRTALSKGKDYLV 

LAQPPLLLPAESVDVSVLQPLSGELENTNDEREAE 

TVVLGDLSSSMHQDDLVNHTVDAVELEPSHSQT 

LSQSLLLDITPEINPLPKIEVSESVEYEAGHIPSPVI 

PQESSVEIDNETEQKSESFSSIEKPSITYETNKVNE 

LMDNIIKEDVNSMQIFTKLSETIVPPINTATVPDN 

EDGEAKMNIADTAKQTLISWDSSSLPEVKEEEQ 

SPEDALLRGLQRTATDFYAELQNSTDLGYANGN 

LVHGSNQKESVFMRLNNRIKALEVNMSLSGRYL 

EELSQRYRKQMEEMQKAFNKT1VKLQNTSRIAE 

EQDQRQTEAIQLLQAQLTNMTQLVSNLSATVAE 

LKREVSDRQSYLVISLVLCWLGLMLCMQRCRN 

TSQFDGDYISKLPKSNQYPSPKRCFSSYDDMNLK 

RRTSFPLMRSKSLQLTGKEVDPNDLYTVEPLKFSP 

FKKKKRPlfYK'TFK'TFTrRrPFFPT UPTA\Tf;nn<:f;Rl<r 

PFTNQRDFSNMGEVYHSSYKGPPSEGSSETSSQS 
EESYFCGISACTSLCNGQSQKTKTEKRALKRRRS 
KVQDQGKLIKTLIQTKSGSLPSLHDDKGNKEITV 
GTFGVTAVSGHI 


3174 


A 


485 


4668 


RKCSKEKASKTPSQKIPTTPCCVLQAGPEPRSLAE 
RMGADGETX^VLKNMLIGVNLILLGSMIKPSECQL 
EVTTERVQRQSVEEEGGIANYNTSSKEQPVyFNH 
VYNINVPLDNLCSSGLEASAEQEVSAEDETLAEY 
MGQTSDHESQVTFTHRINFPKKACPCASSAQVLQ 
ELLSR1EMLEREVSVLRDQCNANCCQESAATGQL 

DYIPHCSGHGNFSFESCGCICNEGWFGKNCSEPY 

CPLGCSSRGVCVDGQCICDSEYSGDDCSELRCPT 

DCSSRGLCVDGECVCEEPYTGEDCRELRCPGDCS 

GKGRCANGTCLCEEGYVGEDCGQRQCLNACSG 

RGQCEEGLCVCEEGYQGPDCSAVAPPEDLRVAG 

ISDRSIELEWDGPMAVTEYVISYQPTALGGLQLQ 

QRVPGDWSGVTITELEPGLTYNISVYAVISNILSL 

PITAKVATHLSTPQGLQFKTITETTVEVQWEPFSF 

SFDGWEISFIPKNNEGGVIAQVPSDVTSFNQTGLK 

PGEEYTVN WALKEQARSPPTSA SVSTVIDGPTQI 

LVRDVSDTVAFVEWIPPRAKVDFILLKYGLVGGE 

GGRTTFRLQPPLSQYSVQALRPGSRYEVSVSAVR 

GTNESDSATTQFTTEIDAPKNLRVGSRTATSLDL 

EWDNSEAEVQEYKWYI-ELAGEQYHEVLVPRGI 

GPTTRATLTDLVPGTEYGVGISAVMNSQQSVPAT 

MNARTELDSPRDLMVTASSETSISLIWTKASGPID 

HYRJTFTPSSGIASEVTVPKJDRTSYTLTDLEPGAE 

YIISVTAERGRQQSLESTVDAFTGFRPISHLHFSH 

VTSSSVNITWSDPSPPADRLILNYSPRDEEEEMME 

VSLDATKRHAVLMGLQPATEYIVNLVAVHGTVT 

SEPIVGSITTGIDPPKDITISNVTKDSVMVSWSPPV 

ASFDYYRVSYRPTQVGRLDSSWPNTVTEFTITR 

LNPATEYEISLNSVRC3REESER1CTLVHTAMDNP 

VDLIATNITPTEALLQWKAPVGEVENYVIVLTHF 

AVAGEmVDGVSEEFRLVDLLPSTHYTATMYAT 

NGPLTSGTISTNFSTLLDPPANLTASEVTRQSALIS 

WQPPRAEIENYVLTYKSTDGSRKELIVDAEDTWI 

RLEGLLENTDYTVLLQAAQDTTWSSITSTAFTTG 

GRVFPHPQDCAQHLMNGDTLSGVYPIFLNGELS 

QKLQVYCDMTTDGGGWIVFQRRQNGQTDFFRK 

WADYRVGFGNVEDEFWLGLDNIHRITSQGRYEL 

YNGTAGDSLSYHQGRPFSTEDRDNDVA VTNC A 
MSYKGAWWYKNCHRTTSTLNGKYGESRHSQGIN 
WYHWKGHEFSIPFVEMKMRPYNHRLMAGRKRQ 
SLQF 


3175 


A 


2 


623 


RLQLPACPALS AAHPLALPSFS SQCHRAEARAAA 
^TAEGTMASGVTVNDEVIKVFNDMKVRICSST 
OEFTKKRKKAVT FPT ^nriftTRrYTTVRPA VTtTT vnrtT 

v<«^J-r»-^-'^Jvrv-rv v i^r \^i^OL/LJI\J\\^xl V HH/i.iVV^li^ V \JLJL ■ 

GDTVEDP YTSF VKLLPLND CRY AL YD ATYETKE 
SKKEDL VFIF WAPES APLKSKMI YA S SKD AIKKK 
FTGIKHEWQVNGLDDIKDRSTLGEKLGGNVVVS 
LEGKPL 


3176 


A 


99 


1567 


PRGCWSSCLDAMFRLNSLSALAELAVGSRWYH 

GG SQPIQIRRRLMM V AFLG A S A VTASTGLL WKR 

AHAESPPCVDNLKSDIGDKGKNKDEGDVCNHEK 

KTADLAPHPEEKKKKRSGFRDRKVMEYENRIRA 

YSTPDKIFRYFATLKVISEPGEAEVFMTPEDFVRS 

ITPNEKQPEHLGLDQYIEKRFDGKTEKISQEREKF 

ADEGSIFYTLGECGLISFSDYIFLTTVLSTPQRNFE 

IAFKMFDLNGDGEVDMEEFEQVQSIIRSQTSMG 

MRHRDRPTTGNTLKSGLCSALTTYFFGADLKGK * 

LTIKNFLEFQRKLQHDVLKLEFERHDPVDGRITE 

RQFGGMLLAYSGVQSKKLTAMQRQLKKHFKEG 

KGLTFQEVENFFTFLKNINDVDTALSFYHMAGAS 

LDKVTMQQVARTVAKVELSDHVCDVVFALFDC 
DGNGELSNKEFVSIMKQRLMRGLEKPKDMGFTR 
LMQAMWKCAQETAWDFALPKQ 


3177 


A 


182 


648 


LGVVGSGAAVGGROAARGAAT GftRPMAAVT n 
ALGATRRLLAALRGQSLGLAAMSSGTHRLTAEE 
RNQAILDLKAAGWSELSERDAIYKEFSFHNFNQA
FGFMSRVALQAEKMNHHPEWFWWKVQITLTS 
HDCGELTKKDVKLAKFIEKAAASV 


3178 


A 


8 


612 


ACGCRSFCGSTVMSLLLYYALPALGSYAMLSIFF 
LRRPHLLHTPRAPTFRIREGAHRGG^IGFT 1 FNJTM 

EAMENSMAQRSDLLELDCQLTRDRVVWSHDE 
NLCRQSGLNRDVGSLDFEDLPLYKEKLEVYFSPG 
HFAHGSDRRMVRLEDLFQRFPRTPMSVEIKGKN 
EEL1REIAGLVRRYDRNEITIWASEKSSVMKKCK 


3179 


A 


88 


1496 


QETSKMETLSFPRYNVAEIVIHIRNKILTGADGKN

LTKI^LYPNPKPEVLHMIYMRALQIVyGIRLEHF 

YMMPVNSEVMYPHLMEGFLPFSNLVTHLDSFLPI 

CRVNDFETADILCPKAKRTSRFLSGIINFIHFREAC 

RETYMEFLWQYKSSADKMQQLNAAHQEALMK 

LERLDS VPVEEQEEFKQLSDGIQELQQSLNQDFIi 

QKTIVLQEGNSQKKSNISEKTKRLNELKLSVVSL 

KEIQESLKTKIVDSPEKLKl<ryK£KMKDTVQKLK 

NARQEWEKYEIYGDSVDCLPSCQLEVQLYQKK 

IODLSDNREKLASTLKESLNI FDOTF<snF<sFr fcTTn 

KTEENSFKRLMIVKKEKLATAQFKINK1CHEDVK 
QYKRTVIEDa^KVQEKRGAVYERVTTINHEIQKI 
" RLGIQQLKDAADREKLKSQEIFLNLKTALEKYHD 
GIEKAAEDSYAKIDEKTAELKRKMFKMST 


3180 


A 


298 


7086 


GNMAC WPQLRLLL WKNLTFRRRQTCQLLLEVA 

WPLFIFLILISVRLSYPPYEQHECHFPNKAMPSAG 

TLPWVQGnCNANNPCFRYPTPGEAPGWGNFNK 

SIVARLFSDARRLLLYSQKDTSMKDMRKVLRTL 

QQIKKSSSNLKLQDFLVDNETFSGFLYHNLSLPK 

STVDICMLRADVILHKVFLQGYQLHLTSLCNGSK 

SEEMIQLGDQEVSELCGLPREKLAAAERVLRSN 

MDILKPILRTLNSTSPFPSKELAEATKTLLHSLGT 

LAQELFSMRSWSDMRQEVMFLTNVNSSSSSTQI 

YQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKAL 

FGGNGTEEDAETFYDNSTTPYCNDLMKNLESSPL 

SRII WKALKPLLVGKIL YTPDTP ATRQ VMAEVNK 

TFQELAWHDLEGMWEELSPKIWTFMENSQEMD 

LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAF 

LAKHPEDVQSSNGSVYTWREAFNETNQAIRTISR 

FMECVNLNKLEPIATEVWLINKSMELLDERKFW 

AGIVFTGITPGSIELPHHVKYKIRMGIDNVERTNK 

IKDGYWDPGPRADPFEDMRYVWGGFAYLQDW 

EQAIIRVLTGTEKKTGVYMQQMPYPCYVDDIFLR 

VMSRSMPLFMTLAWIYSVAVIIKGIVYEKEARLK 

ETMRIMGLDNSBLWFSWFISSLIPLLVSAGLLWI 

LKLGNLLPYSDPSWFVFLSVFAVVTILQCFLIST

LFSRANLAAACGGIIYFTLYLPYVLCVAWQDYV 

GFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQW 

DNLFESPVEEDGFNLTTSVSMMLFDTFLYGVMT 

WYIEAVFPGQYGIPRPWYFPCTKSYWFGEESDEK 

SHPGSNQKMSEICMEEEPTHLKLGVSIQNLVKVY 

RDGMKVAVDGLALNFYEGQ1TSFLGHNGAGKTT 

TMSLLTGLFPPTSGTAYILGKDIRSEMSTIRQNLG 

VCPQHNVLFDMLTVEEHIWFYARLKGLSEKHVK 

AEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLS 

VALAPVGGSKVVILDEPTAGVDPYSRRGIWELLL 

KYRQGRTIILSTHHMDEADX^GDRIAnSHGKLCC 

VGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNS 

SSTVSYLKKEDSVSQSSSDAGLGSDHESDTLTID 

VSAISNLIRKHVSEARLVEDIGHELTYVLPYEAA

KEGAFVELFHEIDDRLSDLGISSYGISETTLEEIFL 

KVAEESGVDAETSDGTLPARRNRRAFGDKQSCL 

RPFTEDDAADPNDSDIDPESRETDLLSGMDGKGS 

YQVKGWKLTQQQFVALLWKRLLLARRSRKGFF 

AQIVLPAVFVCIALVFSLIVPPFGKYPSLELQPWM 

YNEQYTFVSNDAPEDTGTLELLNALTKDPGFGT 

RCMEGNPIPDTPCQAGEEEWTTAPVPQTIMDLFQ 

NGNWTMQNPSPACQCSSDKIKKMLPVCPPGAGG 

LPPPQRKQNTADILQDLTGRMSDYLVKTYVQIIA 

KSLKNKIWVNEFRYGGFSLGVSNTQALPPSQEV 

NDATXQMKKHLKLAKDSSADRFLNSLGRFMTG 

LDTRNNVKVWFNNKGWHAISSFLNVINNAILRA 

NLQKGENPSHYG1TAFNHPLNLTKQQLSEVAPM 

TTSVDVLVSICVIFAMSFVPASFWFLIQERVSKA 

KHLQFISGVKPVIYWLSNFVWDMCNYVVPATLV 

IIIF1CFQQKSYV SSTNLPVL ALLLLLYG WSITPLM 

YPASFVFKIPSTAYWLTS VNLFIGINGS VATFVL 

ELFTDNKLNNINDILKSVFLIFPHFCLGRGLIDMV 

KNQAMADALERFGENRFVSPLSWDLVGRNLFA 

MAVEGVVFFLITVLIQYRFFIRPRPVNAKLSPLND 

EDEDVRRERQRILDGGGQNDILEIKELTKIYRRK 

RKPAVDRICVGIPPGECFGLLGVNGAGKSSTFKM 

LTGDTTVTRGDAFLNRNSILSNIHEVHQNMGYCP 

QFDAITELLTGREHVEFFALLRGVPEKEVGKVGE 

WAIRKLGLVKYGEKYAGNYSGGNKRKLSTAMA 

LIGGPPVVFLDEPTTGW)PKARRFLWNGALSVV 

KEGRSVVLTSHSMEECEALCTRMAIMVNGRFRC 

Lub V ^JtiLKJN Kr CjJLXj Y 1 1 V VR1A GSNPDLKrVQDF 

FGLAFPGSWKEKHR3^MLQYQLPSSLSSLARIFSI 

LSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 

DHLKDLSLHKNQTVVDVAVLTSFLQDEKVKESY 

V


3181 


A 


215 


1367 


PPATSQAALPEALSKGRETPRPATHPARSQDVRP 

LSCPFDFLRDNVEWSEEQAAAAERKVQENSIQR 

VCQEKQVDYEINAHKYWNDFYKIHENGFFKDR 

HWLFTEFPELAPSQNQMDLKDWFLENICSEVPEC 

RNNEDGPGLIMEEQHKCSSKSLEHKTQTPPVEEN 

VTQKISDLEICADEFPGSSATYRILEVGCGVGNTV 

FPELQTNNDPGLFVYCCDFSSTAIELVQTNSEYDP 

VPDKMQKAINRLSRLLKPGGMVLLRDYGRYDM 
AOLRFKKGOCLSGlSn?YVRGDGTRVYFFTOFFT F) 
T1JTTAGLEKVQNLVDRRLQVNRGKQLTMYRV 
WIQCKYCKPLLSSTS 


3182 


A 


3 


1289 


GSETQI^PRDPQELPWDPQQHQDRRRPELFHAF 
ARD S APPP SMVL AAETTSQQERLQ AIAEKRKRQ 

AEIENKRRQLEDERRQLQHLKSKALRERWLLEG 

TPSSASEGDEDLRRQMQDDEQKTRLLEDSVSRLE 

KGIEVLERGDSAPAAAKENAAAPSPVRAPAPSPA 

KEERKTEVVMNSQQTPVGTPKDKRVSNTPLRtV 

DGSPMMKAAMYSVEITVEKDKVTGETRVLSSTT 

LLPRQPLPLGIKVYEDETKVVHAVDGTAENGIHP 

T gccu\/tm?t tuv a r^ci/Tr on a pct a r* a a dtd pat; 
JLoboiiVUJbJ^lilis^lJiiV lLocALrbl AuAAblRGAV 

EGAARTTPSRREITGVQAQPGEATSGPPGIQPGQE 

PPVTMEFMGYQNVEDEAETKKVLGLQDTTTAEL 

WIEDAAEPKEPAPPNGSAAEPPTEAASREENQA 

GPEATTSDPQDLDMKKHRCKCCSIM 


3183 


A 


333 


1931 


IAPTGGSHSEIQKQLGSGGDSSSQRRAERRTEPRS 

APRPR WGRS ARSPG AHKLPG PPRRRDPG A WARL 

EAAAAHRHSRGSMGRRMRGAAATAGLWLLAL 

GSLLALWGGLLPPRTELPASRPPEDRLPRRPARS 

GGPAPAPRFPLPPPLAWDARGGSLKTFRALLTLA 

AGADGPPRQSRSEPRWHVSARQPRPEESAAVHG 

GVFWSRGLEEQVPPGFSEAQAAAWLEAARGAR 

MVALERGGCGRSSNRLARFADGTRACVRYGINP 

EQIQGEALSYYLARLLGLQRHVPPLALARVEAR 

GAQWAQVQEELRAAHWTEGSVVSL*niWLPNLT 

DVWPAP.WRSEDGRLRPLRDAGGELANLSQAEL 

VDLVQWTDLILFDYLTANFDRLVSNLFSLQWDP 

KVMQKA 1 oNLHKOPOOALVFLDNEAGLVHGYK 

VAGMWDKYNEPLLQSVCVFRERTARRVLELHR

GQDAAARLLRLYRRHEPRFPELAALADPHAQLL 

QRRLDFLAKHILHCKAKYGRRSGDLVSPGGKER 

DLGLGYG 


3184 


A


1 


1004 


GSTHASADAWAQWFCTEALVMGAPVWYLVAA 

ALLVGFILFLTRSRGRAASAGQEPLHNEELAGAG 

RVAQPGPLEPEEPRAGGRPRRRRDLGSRLQAQR 

RAQRVAWAEADENEEEAVILAQEEEGVEKPAET 

HLSGKIGAKKLRKLEEKQARKAQREAEEAEREE 

KKKLbSQKbAb WKKJ3EERLRLEEEQKEEEERKA 

REEQAQREHEEYLKLKEAFWEEEGVGETMTEE 

QSQSFLTEFINYIKQSKVVLLEDLASQVGLRTQD 

TINRIQDLLAEGTITGVIDDRGKFIYITPEELAAVA 

NFIRQRGRV SIAELAQASNSLIA WGRESPAQAP A


3185 


A 


2981 


7173 


CLLAGKFSSTLYETGGCDMSLXWEPAARRASNI 
CDTDSHVSSSTSVRFYPHDVLSLPQIRLNRLLTID 
TDLLEQQDEDLSPDLAATYGPTEEAAQKVKHYY 
RFWILPQLMGINFDRLTLLALFDRNREILENVLA 
VILAn-VAFLGSILLIQGFFRDIWVFQFCLVIASCQ 
YSLLKSVQPDSSSPRHGHNRIIAYSRPVYFCICCG 
LIWLLDYGSRNLTATKFKLYGITFTNPLVFISARD 
LVIWTLCFPIVFFIGLLPQWTFVMYLCEQLDIHI 
FGGNATTSLLAALYSFICSIVAVALLYGLCYGAL 
KDSWDGQfflPVLFSIFCGLLVAVSYHLSRQSSDP 
SVLFSLVQSKIFPKTEEKNPEDPLSEVKDPLPEKL 
RNSVSERLQSDLVVCIVIGVLYFAI1WSTVFTVLQ 

PAT K"YVT YTT Vf^FVnPVTPTWT POVRT^HT PWH 
i r\JL»JCv I V JL» I I Ju V \Jr V VJT V Ifll V Lry V XvlvV^JL/Xr W fl 

CFSHPLLKTLEYNQYEVRNAATMMWFEKLHVW 
LLFVEKNIIYPLIVLNELSSSAETIASPKXLNTELG 
ALMWAGLKIXRSSFSSPTYQYVTVIFTVLFFKF 
DYEAFSETMLLDLFFMSH-FNKLWELLYKLQFVY 

TYIAPWQITWGSAFHAFAQPFAVPHSAMLF1QAA 

VSAFFSTPLNPFLGSAIFITSYVRPVKFWERDYNT 

KRVDHSNTRLASQLDRNPGTYCQQREVEAITEG 

VEEDEGFCCCEPGHIPHMLSFNAAFSQRWLAWE 

VIVTKYILEGYSITDNSAASMLQWDLRKVLTTY

YVKGIIYYVTTSSKLEEWLANETMQEGLRLCAD 

RNYVDVDPTFNPNIDEDYDHRLAGISRESFCVIY 

LNWIEYCS SRKAKP VD VDKDSSLVTLC YGLC VL 

GRRALGTASHHMSSNLESFLYGLHALFKGDFRIS 

SIRDE WIF ADMELLRX W VPG IRMSTKLHQDHFT 

SPDEYDDPTVLYEAIVSHEKNLVIAHEGDPAWRS 

AVLANSPSLLALRHVMDDGTNEYKIIMLNRRYL 

SFRVIKVNKECVRGLWAGQQQELVFLRNRNPER 

GSIQNAKQALRNMINSSCDQPIGYPIFVSPLTTSY 

SDSHEQLKDILGGPISLGNIRNFIVSTWHRLRKGC 

GAGCNSGGNIEDSDTGGGTSCTGNNATTANNPH 

SNVTQGSIGNPGQGSGTGLHPPVTSYPPTLGTSHS 

SHSVQSGLVRQSPARASVASQSSYCYSSRHSSLR 

MSTTGFVPCRRSSTSQISLRNLPSSIQSRLSMVNQ 

MEPSGQSGLACVQHGLPSSSSSSQSIPACKHHTL 

VGFLATEGGQSSATDAQPGMTLSPANNSHSRKA 

EVIYRVQIVDPSQILEGINLSKRKELQWPDEGIRL 

KAGRNSWKDWSPQEGMEGHVIHRWVPCSRDPG 

TRSHIDKAVLLVQIDDKYVTV1ETGVLELGAEV 


3186 


A 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAVVHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSIKLWEWQLQDD 

KNQSLFCWEIPVQIVSHL 


3187


A


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

DG\OKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAWHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSIKLVVEWQLQDD 

KNQSLFCWEIPVQIVSHL 


3188 


A 


2 


3483 


PRVRTKL1LLVNDKKRYERVGGGPKRLG1U)VEM 

EEMIEQLQEKVHELEKQNDTLKNRLISAKQQLQT 

QGYRQTPYNNVQSRINTGRRKANENAGLQECPR 

KGn<LTQDAD V AETPOTMFTK YGNSLLEE ARGEIR 

iS^ElWIQSQRGQffiELEl^AEILKTQLRRKENEIE 

LSLLQLREQQATDQRSNDIDNVEMIKLHKQLVE 

KSNALSAMEGKFIQLQEKQRTLKISHDALMANG

DELNMQLKEQRLKCCSLEKQLHSMKFSERRIEEL 

QDRINDLEKERELLKENYDKLYDSAFSAAHEEQ 

WKLICEQQLKVQIAQLETALKSDLTDKTEILDRL 

KTERDQNEKLVQENRELQLQYLEQKQQLDELKK 

RIKLYNQENDINADELSEALLLn<AQKEQKNGDL 

SFLVK\0DSED^LERSM1^LQATHAETVQELEK 

TRNMLMQHKTNICDYQMEVEAVTRKMENLQQD 

YELKVEQYVHLLDI^ARJDHKLEAQLKDIAY^ 

QYOKPE1MPDDSVDEFDETIHLERGENLFEIHIN 

T^VTF^WVT OA^rini^PPVTrTPTVA'PVTiTrPT HTTP 

VVRGLKPEYNrTSQYLVHVNDLFLQYIQKNTITL 
EVHQAYSTEYETIAACQLKFHEILEKSGRIFCTAS 
LIGTKGDIPNFGTVEYWFRLRVPMDQAIRLYRER 
AKALG YITSNFKGPEHMQSLSQQ APKTAQLS STD 

STDGNLNELHITIRCCNHLQSRASHLQPHPYVVY 

KFFDFADHDTAIIPSSNDPQFDDHMYFPVPMNM 

DLDRYLKSESLSFYVFDDSDTQENIYIGKVNVPLI 

SLArlDRCISGH^LTDHQKHPAGTIHVILKWKFA 

YLPPSGSITTEDLGNFIRSEEPEWQRLPPASSVST 

LVLAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQ 

EGSVDEVKENTEKMQQGKDDVSLLSEGQLAEQS 

LASSEDETEITEDLEPEVEEDMSASDSDDCIIPGPI 

SKNIKQPSEKIRIEIIALSLNDSQVTMDDTIQRLFV 

bCKr Y bLrAbb 1 r VbLrKPKDCjQ W VYYNYSNVIY 

VDKENNKAKRDILKAILQkQEMPNRSLRFTVVS 

DPPEDEQDLECEDIG VAHVDLADMFQEGRDLIE 

QNIDVFDARADGEGIGKLRVTVEALHALQSVYK 

QYRDDLEA 


3189 


A 


476 


1175 


MKGSGWHLRSGMVGTLITTILPHWRRTAHVGTN 

ILTAVSYLKGLWMECVWHSTGIYQCQIYRSLLA 

LPQDLQAARALMGISCLLSGIACACAVIGMKCTR 

CAKGTPAKTTFAILGGTLFILAGLLCMGAVSWTT 

NDWQNFYNPLLPSGMKFEIGQALYLGFISSSLSL 

IGGTLLCLSCQDEAPYRPYQAPPRATTTTANTAP 

AYQPPAAYKDNRAPSVTSATHSGYRLNDYV 


3190 


A 


267 


1037 


DRMAWQGLVLAACLLMFPSTTADCLSRCSLCA 

VKTQDGPKPINPLICSLQCQAALLPSEEWERCQSF 

LSFFTPSTLGLNDKEDLGSKSVGEGPYSELAKLS 

GSFLJUiLbKbl^LPSISlJ^ 

DGFREGAESELMRDAQLNDGAMETGTLYLAEE 

DPKEQVKRYGGFLRKYPKRSSEVAGEGDGDSM 

GHEDLYKRYGGFLRRIRPKLKWDNQKRYGGFLR 

RQFKVVTRSQEDPNAYSGELFDA 


3191 


A


29 


574 


GTSAGAQTKGALCQLKVPTEKLPSPLPTMADEID 

rlTuDAGASSTYPMQCSALRKNGFVvLKGRPCK 

IVEMSTSKTGKHGHAKVHLVGIDIFTGICKYEDIC 

PSTHNMDVPNIKRNDYQLICIQDGYLSLLTETGE 

VREDLKLPEGELGKEIEGKYNAGEDVQVSVMCA 

MSEEYAVADCPCK 


3192 


A 


105 


1661 


KVSADGMQSCESSGDSADDPLSRGLRRRGQPRV 
WIGAGLAGLAAAKALLEQGFTDVTVLEASSHIG 
GRVQSVKLGHATFELGATWIHGSHGNPIYHLTE
ANGLLEETTDGERSVGRISLYSKNGVACYLTNH 
GRRIPKDVVEEFSDLYNEVYNLTQEFFRHDKPVN 
AESQNSVGVFTREEVRNRIR>CT 
AMIQQYLKVESCESSSHSMDEVSLSAFGEWTEIP
GAHHEPSGFMRVVELLAEGIPAHVIQLGKPVRCI 
HWDQASARPRGPEIEPRGEGDHNHDTGEGGQGG 
EEPRGGRWDEDEQWSVWECEDCELIPADHVIV 
TVSLGVLKRQYTSFFRPGLPTEKVAAIHRLGIGTT 
Drurbbr bbrr WGPbCNSLQF V WJbDEAESH TLTY 
PPELWYRKICGFDVLYPPERYGHVLSGWICGEEA 
LVMEKGDDEAVAEICTEMLRQFTGNPNIPKPRRI 
LRSAWGSNPYFRGSYSYTQVGSSGADVEKLAKP 
I PYTESSKTATK 


3193 


A 


1 


192S 


QLGTRRCLRGDKVTTsIAMQDFLVTNLEPRFIEPQT 
ANLSVWIGDSNSTTPLIFVLSPGTDPAADLYKFA 
EEMKFSKKLSAISLGQGQGPRAEAMMRSSIERGK 
WVFFQNCHLAPSWMPALERLIEHINPDKVHRDF 

RLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRAN 

LLKSYSSLGEDFLNSCHKVMEFKSLLLSLCLFHG 

NALERRKFGPLGFNIPYEFrDGDLRICISQLKMFL 

DEYDDIPYKVLKYTAGEINYGGRVTDDWDRRCI 

MNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLH 

GYLSYIKSLPLNDMPEIFGLHDNANITFAQNETFA 

LLGTOQLQPKSSSAGSQGREEIVEDVTQNILLKVP 

EPINLQWVMAKYPVLYEESMNTVLVQEVIRYNll 

LLQVITQTLQDLLKALKGLVVMSSQLELMAASL

YNNTVPELWSAKAYPSLKPLSSWVMDLLQRLDiF

LQAWIQDGIPAWWISGFFFPQAFLTGTLQNFAR

KFVISIDTISFDFKVMFEAPSELTQRPQVGCYIHG

LFLEGARWDPEAFQLAESQPKELYTEMAVTWLL 

PTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHST 

NYVL^VEIPTHQPQRHWIKRGVALICALDY 


3194 


A 


1 


1023 


DGWTPVHAAVDTGNVDSLKLLMYHRIPAHGNS 

FNEEESESSVFDLDGGEESPEGISKPWPADLINH 

ANREGWTAAHIAASKGFKNCLEILCRHGGLEPE 

RRDKCNRTVHDVATDDCKHLLENLNALKIPLRIS 

VGEIEPSNYGSDDLECENTICALNIRKQTSWDDFS 

KAVSQALTNOTQAISSDGWWSLEDVTCNNTTDS 

NIGLSARSIRSITLGNVPWSVGQSFAQSPWDFMR 

KNKAEHITVLLSGPQEGCLSSVTYASMIPLQMM 

QNYLRLVEQYHNVIFHGPEGSLQDYIVHQLALCL 

KHRQMGWQDSPVEIVEELEVGCWFFPREQLLRT 

CSLVA 


3195 


A 


1 


1809 


MAASAQVSVTFEDVAVTFTQEEWGQLDAAQRT 

LYQEVMLETCGLLMSLGCPLFKPELIYQLDHRQE 

LWMATKDLSQSSYPGDNTKPKTTEPTFSHLALPE 

EVLLQEQLTQGASKNSQLGQSKDQDGPSEMQEV 

HLKIGIGPQRGKLLEKMSSERDGLGSDDGVCTKI 

TQKQVSTEGDLYECDSHGPVTDALIREEKNSYK 

CEECGKVFKKNALLVQHERIHTQVKPYECTECG 

KTFSKSTHLLQHLIIHTGEKPYKCMECGKAFNRR 

SHLTRHQRIHSGEKPYKCSECGKAFTHRSTFVLH 

HRSHTGEKPFVCKECGKAFRDRPGFIRHYIIHTGE 

KPYECffiCmCGKAFNRRSYLTWHQQIHTGVKPF 

ECNECGKAFCESADLIQHYIIHTGEKPYKCMECG

KAFhniRSHLKQHQRIHTGEKPYECSECGKAFTH 

CSTFVLHKRTHTGEKPYECKECGKAFSDRADLIR 

HFSIHTGEKPYECVECGKAFNRSSHLTRHQQIHT

GEKPYECIQCGKAFCRSANLIRHSIfflTGEKPYEC 

SECGKAFNRGSSLTHHQRKTGRNPTIVTDVGRP 

FMTAQTSVNIQELLLGKEFLNITTEENLW 


3196 


A 


1400 


264 


VGFWERPLRSSRWFRRSLRRWEMLARAARGTG 

ALLLRGSLLASGRAPRRASSGLPRNTVVLFVPQQ 

EAWVVERMGRFflRILEPGLNILIPVLDRIRYVQSL 

KEIVINVPEQSAVTLDNVTLQIDGVLYLRIMDPY 

KASYGVEDPEYAVTQLAQTTMRSELGKLSLDKV 

FRERESLNASIVDAINQAADCWGIRCLRYEIKDIH 

VPPRVKESMQMQVEAERRKRATVLESEGTRESA 

INVAEGKKQAQELASEAEKAEQINQAAGEASAVL 

AKAKAKAEAIRILAAALTQHNGDAAASLTVAEQ 

YVSAFSKLAKDSNTDLLPSNPGDVTSMVAQAMG 

VYGALTKAPVPGTPDSLSSGSSRDVQGTDASLDE 

ELDRVKMS 


3197 


A 


66 


3632 


LWECAAAAAGQRDGGVTLFLKGRVLGRRCAAS 

LFAREVCVSTSSSRPACFLHCARARGEQMHQMA 

SGVGSMKRSPRKMWRPGEKKEPQGVVYEDVRD 

DTEDFKEPLKWFEGSAYGLQNFNKQKKLKTCD 

DMDTFFLHYAAAEGQIELMEKITRDSSLEVLHE 

MDDYGNTPLHCAVEKNQIESVKFLLSRGANPNL

RNFNMMAPLHIAVQGMNNEVMKVLLEHRTIDV

NLEGENGNTAVIIACTTNNSEALQILLNKGAKPC 

KSNKWGCFPIHQAAFSGSKECMEIILRFGEEHGY 

SRQLHINFMNNGKATPLHLAVQNGDLEMIKMCL 

DNGAQIDPVEKGRCTAIHFAATQGATErVKLMIS 

SYSGSVDIVNTTDGCHETMLHRASLFDHHELAD 

YLISVGADD^JKIDSEGRSPLILATASASWNIVNLL 

LSKGAQVDIKDNFGRNFLHLTVQQPYGLKNLRP 

EFMQMQQIKELVMDEDNDGCTPLHYACRQGGP 

GSVNNLLGFNVSIHSKSKDKKSPLHFAASYGRIN 

TCQRLLQDISDTRLLNEGDLHGMTPLHLAAKNG 

HDKWQLLLKKGALFLSDHNGWTALHHASMGG 

YTQTMKVILDTNLKCTDRLDEDGNTALHFAARE 

GHAKAVALLLSHNADIVLNKQQASFLHLALHNK 

RKEWLTURSKRWDECLKIFSHNSPGNKCPITEM 

IEYLPECMKVLLDFCMLHSTEDKSCRDYYEEYNF 

KYLQCPLEFTKKTPTQDVIYEPLTALNAMVQNN 

RIELLNHPVCKEYLLMKWLAYGFRAHMMNLGS

YCLGLIPMTILWNIKPGMAFNSTGIINETSDHSEI 

LDTTNSYLIKTCMILVFLSSIFGYCKEAGQIFQQK 

RNYFMDISNVLEWTIYTTGIIFVLPLFVEIPAHLQ 

WQCGAIAVYFYWMNFLLYLQRFENCGIFIVMLE

VILKTLLRSTVVFIFLLLAFGLSFYELLNLQDPFSS

PLLSIIQTFSMMLGDINYRESFLEPYLRNELAHPV 

LSFAQLVSFTIFVPIVLMNLLIGLAVGDIAEVQKH 

ASLKRIAMQVELHTSLEKKLPLWFLRKVDQKSTI 

VYPNKPRSGGMLFHIFCFLFCTGEIRQEIPNADKS 

LEMEILKQKYRLKDLTFLLEKQHELIKLIIQKMEn 

SEIEDDDSHCSFQDRFKKEQMEQRNSRWNTVLR 

AVKAKTHHLEP 


3198 


A 


51 


2177 


KEKSLHHVDQRPPLWHPGRPGTSQSAAMNASSE

GESFAGSVQIPGGTTVLVELTPDIHICGICKQQFN 

NLDAFVAHKQSGCQLTGTSAAAPSTVQFVSEET 

VPATQTQTTTRTITSETQTITVSAPEFVFEHGYQT 

YLPTESNENQTATVISLPAKSRTKKPTTPPAQKRL 

NCCYPGCQFKTAYGMKDMERHLKIHTGDKPHK 

CEVCGKCFSRIG3KLKTHMRCHTGVKPYKCKTC 

DYAAADSSSLNKHLRfflSDERPFKCQICPYASRN 

SSQLTVHLRSHTGDAPFQCWLCSAKFKISSDLKR

HMRVHSGEKPFKCEFCNVRCTMKGNLKSHIRIK 

HSGNNFKCPHCAFLGDSKATLRKHSRVHQSEHR 

EKCSECSYSCSSKAALRIHERIHCTVRPFKCNYCS 

FDSKQPSNLSKHMKKFHGDMVKTEALERKDTG 

RQSSRQVAKLDAKKSFHCDICDASFMREDSLRS

HKRQHSEYNESKNSDVTVLQFQIDPSKQPATPLT 

VGHLQVPLQPSQVPQFSEGRVKIIVGHQVPQANT

IVQAAAAAVNIVPPALVAQNPEELPGNSRLQILR 

QVSLIAPPQSSRCPSEAGAMTQPAVLLTTHEQTD 

GATLHQTLIPTASGGPQEGSGNQTFITSSGITCTD 

FEGLNALIQEGTAEVTWSDGGQNIAVATTAPPV 

FSSSSQQELPKQTYSIIQGAAHPALLCPADSIPD 


3199 


A 


13 


2247 


QSFHSMEGDPSGLPLLARGASCYSLICPCPRPAD 

WSILQGTDWSILQSADWCIYNPLARHRALTGVFL 

QSADWCTYNPLARQKSSPSPHSTQEVQLASPLTR 

RPNKKDSAERNHRPAREGSVAQRQPNPAALEKA 

EPAARKRNEREGGGSQEPGREHSLEKGYWAPGL 

GPDPSMCSKQVDPSEGASSHLKHRGGSRAAHLE 

VRRLLRRLVGALyAEAGFCYVQVAEGQRVVGV 

LEVAEAAAAPVQHEPTAAVATQSRWFPRGTRPG 

LCSLPIAVAALLCPGSGPGAQSGLEFVERPPPSPL

AVVLARWPLPPPAGRCPRDAPEARVPEKARAEG 

SERENNYGCGVYGGEMTTLVLDNGAYNAKIGY 

SHENVSVIPNCQFRSKTARLKTFTANQBDEIKDPS 

GLFYILPFQKGYLVNWDVQRQVWDYLFGKEMY

QVDFLDTOniTEPYFNFTSIQESMNEILFEEYQFQ 

AVLRVNAGALSAHRYFRDNPSELCCIIVDSGYSF 

TfflVPYCRSKKKKEAIIRINVGGKLLTNHLKEIISY 

RQLHVMDETHVINQVKEDVCYVSQDFYRDMDI 

AKLKGEENTVMEDYVLPDFSTIKKGFCKPREEMV 

LSGKYKSGEQILRLANERFAVPEILFNPSDIGIQE 

MGIPEAIVYSIQNLPEEMQPHFFKNIVLTGGNSLF

PGFRDRVYSEVRCLTPTDYDVSVVLPENPITYAW 

EGGKLISENDDFEDMVVTREDYEENGHSVCEEK 

FDI 


3200 


A 


3 


307 


AVQRIRHEMNIFRLTGDLSHLAAIVILLLKIWKTR 

SCAGISGKSQLLFALVFTTRYLDLFTSFISLYNTS 

MKVWYAIHRNVFHLQCTGLWTLNLCQLCIFN 


3201 


A 


1 


469 


3RHEGRGQRGKMELVQVLKRGLQQITGHGGLRG 

YLRVFFRTNDAKVGTLVGEDKYGNKYYEDNKQ 

FFGRHRWVWTTEMNGKNTFWDVDGSMVPPE 

WHRWLHSMTDDPPTTKPLTARKFIWTNHKFNVT 

GTPEQYVPYSTTRKKIQEWIPPSTPYK 


3202 


A 


144 


840 


NSSQRIMATHALEIAGLFLGGVGMVGTVAVTVM 

PQWRVSAFIENNiVVFENFWEGLWMNCVRQANI

RMQCKIYDSLLALSPDLQAARGLMCAASVMSFL 

AFMMAILGMKCTRCTGDNEKVKAHILLTAGIIFII 

TGMVVLIPVSWVANAIIRDFYNSIVNVAQKRELG 

EALYLGWTTALVLIVGGALFCCVFCCNEKSSSYR

YSIPSHRTTQKSYHTGKKSPSVYSRSQYV 


3203 


A 


2 


473 


KYRYRRPYPVMRKICQVGPAGLAFILNISPVAHR
VALCHLAGCQEQAAWYHTLQILFFLVSAYFFSCP 
VPEKYFPGSCDIVGHGHQIFHAFLSICTLSQLEAEL 
LDYQGRQEIFLQRHGPLSVHMACLSFFFLAACSA 
ATAALLRHKVKARLTKKDS 


3204 


A 


1808 


668 


PESAPLPAFISSRJLPAAWRNWCSYVVTRTISCHV 

QNGTYLQRVLQNCPWPMSCPGSSYRTWRPTYK 

VMYKIVTAREWRCCPGHSRVSCEEVAGSSASLE 

PMWSGSTMRRMALRPTAFSGCLNCSKVSELTER 

LKVLEAKMTMLTVIEQPVPPTPATPEDPAPLWGP 

PPAQGSPGDGGLQDQVGAWGLPGPTGPKGDAG 

SRGPMGMRGPPGDPLLSNTFTETNNHWPQGPTG 

PPGPPGPMGPPGPPGPTGVPGSPGHIGPPGPTGPK 

GISGHPGEKGERGLRGEPGPQGSAGQRGEPGPKG 

DPGEKSHWGEGLHQLREALKILAERVLILETMIG 
LYEPELGSGAGPAGTGTPSLLRGKRGGHATNYRI 
VAPRSRDERG 


3205 


A 


2810 


1652 


RTSTQKWQSVFND.SQEHLERFYCNPENDRMRM 

KYGGQEFWADLNAIvINVYETTEFDQLRRLSTPPS 

SNVNSIYHTVWKFFCRDHFGWREYPESVIRLIEE 

ANSRGLKEVRPMMWNNHYILHNSFFRREDCRRP 

LFRSCFILLPYLQTLGGVPTQAPPPLEATSSSQIICP 

DGVTSANFYPETWVYMHPSQDFIQVPVSAEDKS 

YRIIYNLFHKTVPEFKYRILQrLRVQNQFLWEKy 

KRKKEYMNRKMFGRDRIINERHLFHGTSQDVVD 

GICKHNFDPRVCGKHATMFGQGSYFAKICASYSH 

NFSKKSSKGVHFMFLAKVLTGRYTMGSHGMRR 

PPPVNPGSVTSDLYDSCVDNFFEPQIFVIFNDDQS 

YPYFVIQYEEVSNTVSI 


3206 


A 


297 


4500 


CLVDSKLWKGARSVYHQLFMSSLLMDLKYKKL

FAVRFAKNYERLQSDYVTDDHDREFSVADLSVQ 

IFTVPSLARMLITEENLMSinKTFMDHLRHRDAQ

GRFQFERYTALQAFKFRRVQSLILDLKYVLISKPT 

EWSDELRQKFLEGFDAFLELLKCMQGMDPITRQ 

VGQHIEMEPEWEAAFTLQMKLTHVISMMQDWC 

ASDEKVLIEAYKKCLAVLMQCHGGYTDGEQPIT 

LSICGHSVETIRYCVSQEKVSIHLPVSRLLAGLHV 

LLSKSEVAYKFPELLPLSELSPPMLEEHPLRCLVL 

CAQVHAGMWRRNGFSLVNQIYYYHhT^KCRRE 

MFDIGDVVMLQTGVSMMDPNHFLMIMLSRFELY 

QIFSTPDYGKI^SSEITrnCDVVQQNNTLIEEMLYL 

IMLVGERFSPGVGQVNATDEIKREIIHQLSIKPM

AHSELVKSLPEDENBCETGMESVEEAVAHFKKPGL 

TGRGMYELKPECAKEFNLYFYHFSRAEQSKAEE 

AQRKLKRQNREDTALPPPVLPPFCPLFASLVNILQ 

SDVMLCIMGTILQWAVEHNGYAWSESMLQRVL 

HLIGMALQEEKQHLENVTEEHVVTFTFTQKISKP 

GEAPKNSPSILAMLETLQNAPYLEVHKDMIRWIL 

KTFNAVKKNIRESSPTSPVAETEGTIMEESSRDKD

KAERKRKAEIARLRREKIMAQMSEMQRHFIDEN 

KELFQQTLELDASTSAVLDHSPVASDMTLTALGP 

AQTQVPEQRQFVTCILCQEEQEVKVESRAMVLA 

AFVQRSTVLSKNRSKFIQDPEKYDPLFMHPDLSC

GTHTSSCGHIMHAHCWQRYFDSVQAKEQRRQQ 

RLRLHTSYDVENGEFLCPLCECLSNTVIPLLLPPR 

NIFNNRLOTSDQPNLTQWIRTISQQIKALQFLRKE 

ESTPNNASTKNSENVDELQLPEGFRPDFRPKEPYS 

ESIKEMLTTFGTATYKVGLKVHPNEEDPRVPIMC 

WGSCAYTIQSIERILSDEDKPLFGPLPCRLDDCLR 

SLTRFAAAHWTVASVSVVQGHFCKPFASLVPND

SHEELPCILDIDMFHLLVGLVLAFPALQCQDFSGI 

SLGTGDLmFHLVTMAHnQILLTSCTEENGMDQE 

NPPCEEESAVLALYKTLHQYTGSALKEEPSGWHL 

WRSVRAGIMPFLKCSALFFHYLNGVPSPPDIQVP 

GTSHFEHLCSYLSLPNNLICLFQENSEIMNSLIES 

WCRNSEVKRYLEGERDAIRYPRESNKLINLPEDY 

SSLINQASNFSCPKSGGDKSRAPTLCLVCGSLLCS 

QSYCCQTELEGEDVGACTAHTYSCGSGVGIFLR 

VRECQVLFLAGKTKGCFYSPPYLDDYGETDQGL 

RRGNPLHLCKERFKKIQKLWHQHSVTEEIGHAQ 
EANQTLVGIDWQHL


3207 


A 


49 


963 


QLSPSQAPAGAQEVARRVTVGSASHGGRRSTMA 
TTVSTQRGPVYIGELPQDFLRItPTQQQRQVQLD 
AQAAQQLQYGGAVGTVGRLNITVVQAKJLAICNY 
GMTRMDPYCRLRLGYAVYETPTAHNGAKNPRW 
NKVIHCTVPPGVDSFYLEIFDERAFSMDDRIAWT

VMSYALLPAAMVMPPQPWLMPTVYQQGVGY 
VPITGMPAVCSPGMVPVALPPAAVNAQPRCSEE 
DLKAIQDMFPNMDQEVIRSVLEAQRGNKDAAIN
SLLQMGEEP 


3208 


A 


54 


1196 


LERTPASADMAWTKYQLFLAGLMLVTGSINTLS

AKWADNFMAEGCGGSKEHSFQHPFLQAVGMFL 

GEFSCLAAFYLLRGRAAGQSDSSVDPQQPFMPLL 

FLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 

VIIFTGLFSVAFLGRRLVLSQWLGrLATIAGLVW 

GLADLLSKHDSQHKLSEVITGDLLIIMAQIIVAIQ 

MVLEEKFVYKHNVHPLRAVGTEGLFGFVILSLLL 

VPMYYlFAG^FSGNrRG 1 LbDALUArCQ VOyQr 

JLIA VALLGNIS SIAFFOTAGIS VTKELS ATTRMVL 

DSLRTVVIWALSLALGWEAFHALQELGFLILLIGT 

ALYNGLHRPLLGRLSRGRPLAEESEQERLLGGTR 

TPINDAS 


3209 


A 


104 


1999 


AKVVSLKEFSCFWRREKPVSSLSSLQVKAEASW 

DSAVHGCPQLSRGTPVDERLFLIVRVTVQLSHPA 

DMQLVLRKRICVNVHGRQGFAQSLLKKMSHRSS 

IPGCGVTFEIVSNIPEDAQGVEEREALARMAANV 

ENPASADSEAYIEKYLRSVLAVENLLTLDRLRQE

VAVKEQLTGKGKLSRRSISSPNVNRLSGSRQDLIP 

SYSLGSNKGRWESQQDVSQTTVSRGIAPAPALSV 

SPQNNHSPDPGLSNLAASYLNPVKSFVPQMPKLL 

KSLrTVRDEKRGKRPSPLAHQPVPRIMVQSASPDI 

RVTRMEEAQPEMGPDVLVQTMGAPALKICDKP 

AKVPSPPPVIAVTAVTPAPEAQDGPPSPLSEASSG s 

YFSHSVSTATLSDALGPGLDAAAPPGSMPTAPEA 

EPEAPISHPPPPTAVPAEEPPGPQQLVSPGRERPDL 

EAPAPGSPFRVRRVRASELRSFSRMLAGDPGCSP 

u AEON Ar: Ar G Auu vAJLAbUbbbAUii V rcW LKbu 

EFVTVGAHKTGVVRYVGPADFQEGTWVGVELD

LPSGKNDGSIGGKQYFRCNPGYGLLVRPSRVRR 

ATGPVRRRSTGLRLGAPEARRSATLSGSATNLAS 

T TA AT AV AT^T>CTJVT\IT>TTKTDVCWAC 
\-j 1 AAJLAIxAJL^itoJrilSJNJr\D^ WAo 


3210 


A


324 


694 


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALVVS 
GGIVGYVKTGSVPSLAAGLLFGSLAGLGAYQLY 
QDPRNVWGFLAATSVTFVGVMGMRSYYYGKF 
MPVGLIAGASLLMAAKVGVRMLMTSD 


3211


A 


1078 


594 


V KjMtrLr A V iMUv V lhLKjri wLL 11 W V r iuo r A 

WANFTILALGVWAVAQRDSIDAISMFLGGLLATI 
FLDIVHISIFYPRVSLTDTGRFGVGMAILSLLLKPL 

TIDSAEAPADPFAVPEGRSQDARGY 


3212 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 
ALKFMMEFRSWCPGWNTKARSRLTATSTSRVQ
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG

QYLFKNKPPDGNAPPNSFYRALYPKHQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECICRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RiySGAYDGKIKVWDLVAALDPRAPAGTLCLRt 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3213 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 
ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 
AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 
RLCLNQETVCLASTAMKTENCVAKTKLANGTSS
MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 
EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 
PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 
VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG
QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 
NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKTV 
SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY
DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 
VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL
RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 
WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS
SDNTERLWDIECGACLRVLEGHEELVRCIRFDNK 
RTVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 
LVEHSGRVFRLQFDEFQIVSSSHDDTELIWDFLND 
PAAQSEPPRSPSRTYTYISR 


3214 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 
ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 
AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 
RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 
MTVPKQRKLSASYEKEKELCVKYFEQWSESDQV 
EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 
PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 
VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG
QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 
NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKJV
SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 
DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA
VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 
RRVLVGHRAAVNVVDFDDKYTVSASGDRTDCV 
WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 
SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 
RTVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 
LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 
PAAQSEPPRSPSRTYTYTSR 


3215 


A 


2 


1376 


EARLVGCQRGGPARPGSYSSGAETAGRAMAAN 

LSRNGPALQEAYVRVVTEKSPTDWALFTYEGNS 

NDIRVAGTGEGGLEEMVEELNSGKVMYAFCRV 

KDPNSGLPKFVLINWTGEGVNDVRKGACASHVS 

TMASFLKGAHVTTNARAEEDVEPECIMEKVAKA 

SGANYSFHKESGRFQDVGPQAPVGSVYQKTNAV 

SEIKRVGKDSFWAKAEKEEENRRLEEKRRAEEA 

QRQLEQERRERELREAARREQRYQEQGGEASPQ 

RTWEQQQEWSRNRIsfEQESAVHPREIFKQKERA 

MSTTSISSPQPGKLRSPFLQKQLTQPETHFGREPA 

AAISRPRADLPAEEPAPSTPPCLVQAEEEAVYEEP 

PEQETFYEQPPLVQQQGAGSEHIDHHIQGQGLSG 

QGLCARALYDYQAADDTEISFDPENLITGIEVIDE 

GWWRGYGPDGHFGMFPANYVELIE 


3216 


A 


936 


204 


AMASTLEYSPSPLRRLVGPAAGFSRAARADLSW 

DPMAFFTGLWGPFTCVSRVLSHHCFSTTGSLSAI 

QKMTIIVRVVDNSALGNSPYHRAPRCIHVYKKN

GVGKVGDQILLAIKGQKKKALIVGHCMPGPRMT 

PRFDSNNVVLIEDNGNPVGTRIKTPIPTSLRKREG 

EYSKVLAIAQNFV 


3217 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAELSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

WAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNWQKLDHWLMSNSSELMITHALERVCSVMP 

ASITKECIILVDTYSPSLVQLVAKITPEKVCKFIRL

CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 

CKRLLWSSHNLESKSTKRDILVAFKGGCSILPLP 

YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 

GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3218 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAYWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCICNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

VVAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNWQKLDHWLMSNSSELMITHALERVCSVMP 

ASITKECIILVDTYSPSLVQLVAKITPEKVCKFIRL

CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 

CKRLLWSSHNLESKSTKRDILVAFKGGCSILPLP

YMIQCKOTVTQYEPVLIESLKDMMDPVAVCICKV 

GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3219 


A 


1623 


572 


TSAEGWKGCTCTFKDRSKLREHLRSHTQEKWA 

CPTCGGMFANNTKFLDHERRQTSLDQQHFQCSH 

CSKI^ATERLLRDHMRNHWHYKCPLCDMTCPL 

PSSLRNHMRFRHSEDRPFKCDCCDYSCKNLIDLQ 

KHLDTHSEEPAYRCDFENCTFSARSLCSIKSHYR 

KVHEGDSEPRYKCHVCDKCFTRGIWLTVHLRK

KHQFKWPSGHPRFRYKEHEDGYMRLQLVRYES 

VELTQQLLRQPQEGSGLGTSLNESSLQGIILETVP 

GEPGRKEEEEEGKGSEGTALSASQDNPSSVIHW 

NQTNAQGQQEIVYYVLSEAPGEPPPVPEPPSGGI 

MEKLQGIAEEPEIQMV 


3220 


A 


2760 


745 


SLGPSGNTRGTGLVLDGDTSYTYHLVCMGPEAS 

GWGQDEPQTWPTDHRAQQGVQRQGVSYSVHA 

YTGQPSPRGLHSENREDEGWQVYRLGARDAHQ 

GRPTWALRPEDGEDKEMKTYRLDAGDADPRRL 

CDLERERWAVIQGQAVRKSSTVATLQGTPDHGD 

PRTPGPPRSTPLEENWDREQIDFLAARQQFLSLE 

QANKGAPHSSPARGTPAGTTPGASQAPKAFNKP 

HLANGHVVPIKPQVKGVVREENKVRAVPTWAS 

VQVVDDPGSLASVESPGTPKETPIEREIRLAQERE 

ADLREQRGLRQATDHQELVEIPTRPLLTKLSLITA 

PRRERGRPSLYVQRDIVQETQREEDHRREGLHV 

GRASfPDWVSEGPQPGLRRALSSDSILSPAPDAR 

AADPAPEVRKVNRIPPDAYQPYLSPGTPQLEFSA

FGAFGKPSSLSTAEAKAATSPKATMSPRHLSESS 

GKPLSTKQEASKPPRGCPQANRGVVRWEYFRLR 

PLRFRAPDEPQQAQVPHVWGWEVAGAPALRLQ 

KSQSSDLLERERESVLRREQEVAEERRNALFPEV 

FSPTPDENSDQNSRSSSQASGITGSYSVSESPFFSPI 

HLHSNVAWTVEDPVDSAPPGQRKKEQWYAGIN 

PSDGINSEVLEAIRVTRHKNAMAERWESRIYASE 

EDD 


3221 


A 


15 


478 


SRVFFFFFFFPAFKMSKRGRGGSSGAKFRISLGLP 
VGAVINCADNTGAKNLYnSVKGIKGRLNRLPAA 
GVGDMVMATVKKGKPELRKKVHPAVVIRQRKS 
YRRKDG\^YFEDNAGVIV>0«GEMKGSAITGP 
VAKECADLWPRIASNAGSIA 


3222 


A 


207 


1321 


PLIPLHPANRSPATMAELQEVQITEEKPLLPGQTP 

EAAKTHSVETPYGSVTFTVYGTPKPKRPAILTYH

DVGLNYKSCFQPLFQFEDMQEHQNFVRVHVDAP 

GMEEGAPVFPLGYQYPSLDQLADMIPCVLQYLN 

FSTnGVGVGAGAYILARYALNHPDTVEGLVLINI 

DPNAKGWMDWAAHKLTGLTSSIPEMILGHLFSQ 

EELSGNSELIQKYRMITHAPNLDNIELYWNSYNN 

RRDLNFERGGDITLRCPVMLVVGDQAPHEDAVV 

ECNSKLDPTQTSFLKMADSGGQPQLTQPGKLTE 

AFKYFLQGMGYMASSCMTRLSRSRTASLTSAAS 

VDGNRSRSRTLSQSSESGTLSSGPPGHTMEVSC 


3223 


A 


132 


1664 


SARRWGAAGAGPHGLHLRAHGPRPSVRTGLPSV 

GRQAAGAAMGRGWGFLFGLLGAVWLLSSGHGE 

EQPPETAAQRCFCQVSGYLDDCTCDVETIDRFNN 

YRLFPRLQKLLESDYFRYYKVNLKRPCPFWNDIS 

QCGRRDCAVKPCQSDEVPDGIKSASYKYSEEAN 

NLIEECEQAERLGAVDESLSEETQKAVLQWTKH 

DDSSDNFCEADDIQSPEAEYVDLLLNPERYTGYK 

GPDAWKIWNVIYEENCFKPQTIKRPLNPLASGQG 

TSEENTFYSWLEGLCVEKRAFYRLISGLHASINV 

HLSARYLLQETWLEKKWGHNITEFQQRFDGILTE 

GEGPRRLKNLYFLYLIELRALSKVLPFFERPDFQL 

FTGNK3QDEENKMLLLEILHEIKSFPLHFDENSFF 

AGDKKEAHKLKEDFRLHFRNISRJMDCVGCFKC 

RLWGKLQTQGLGTALKILFSEKLIANMPESGPSY 

EFHLTRQETVSLFNAFGTQSYKCERIRKTSRNLLQ 

NIH 


3224 


A 


2 


803


PGSTISWDRDAAGESGTRAASPSPSGSRTAGRLP 

SPSYSPLPAPSLFPPPPLPAPAASTMSAGGDFGNP 

LRKFKLVFLGEQSVGKTSLITRFMYDSFDNTYQA 

TIGIDFLSKTMYLEDRTVRLQLWDTAGQERFRSL 

IPSYIRDSTVAVWYDITNLNSFQQTSKWIDDVRT 

ERGSDVIIMLVGNKTDLADKRQITIEEGEQRAKE 

LSVMFIETSAKTGYNVKQLFRRVASALPGMENV 

QEKSKEGMIDIKLDKPQEPPASEGGCSC 


3225


A 


3 


5054 


PEVTKPSLSQPTAASPIGSSPSPPVNGGMNAKRVA 

VPNGQPPSAARYMPREVPPRFRCQQDHKVLLKR 

GQPPPPSCMLLGGGAGPPPCTAPGANPNNAQVT 

GALLQSESGTAPDSTLGGAAASNYANSTWGSGA 

SSNNGTSPNPIfflWDKVIVDGSDMEEWPCIASKD 

TESSSENTTDNNSASNPGSEKSTLPGSTTSNKGK 

GSQCQSASSGNECNLGVWKSDPKAKSVQSSNST 

TENNNGLGNWRNVSGQDRIGPGSGFSNFNPNSN 

PSAWPALVQEGTSRKGALETDNSNSSAQVSTVG

QTSREQQSKMENAGVNFVVSGREQAQIHNTDGP 

KNGNTNSLNLSSPNPMENKGMPFGMGLGNTSRS 

TDAPSQSTGDRKTGSVGSWGAARGPSGTDTVSG 

QSNSGNNGNNGKEREDSWKGASVQKSTGSKND 

SWDNNNRSTGGSWNFGPQDSNDNKWGEGNKM 

TSGVSQGEWKQPTGSDELKIGEWSGPNQPNSST 

GAWDNQKGHPLLENQGNAQAPCWGRSSSSTGS 

EVEGQSTGSNHKAGSSDSHNSGRRSYRPTHPDC 

QAVLQTLLSRTDLDPRVLSNTGWGQTQIKQDTV 

WDIEEVPRPEGKSDKGTEGWESAATQTKNSGG 

WGDAPSQSNQMKSGWGELSASTEWKDPKNTGG 

WNDYKbMNSSNWGGGRPDEKTPSSWNENPSKD 

QGWGGGRQPNQGWSSGKNGWGEEVDQTBCNSN 

WESSASKPVSGWGEGGQNEIGTWGNGGNASLA 

SKGGWEDCKRSPAWNETGRQPNSWNKQHQQQ 

QPPQQPPPPQPEASGSWGGPPPPPPGNVRPSNSS 

WSSGPQPATPKDEEPSGWEEPSPQSISRKMDIDD 

GTSAWGDPNSYNYIOmOLWDKNSQGGPAPREP 

NLPTPMTSKSASDSKSMQDGWGESDGPVTGARH 

PSWEEEEDGGVWNTTGSQGSASSHNSASWGQG 

GKKQMKCSLKGGNNDSWMNPLAKQFSNMGLL 

SQTEDNPSSKMDLSVGSLSDKKFDVDKRAMNLG 

DFNDIMRKDRSGFRPPNSKDMGTTDSGPYFEKG

GSHGLFGNSTAQSRGLHTPVQPLNSSPSLRAQVP 

PQFISPQVSASMLKQFPNSGLSPGLFNVGPQLSPQ 

QRKISQAVRQQQEQQLARMVSALQQQQQQQQR 
QPGMKHSPSHPVGPBCPHLDNMVPNALNVGLPDL 
QTKGPIPGYGSGFSSGGMDYGMVGGKEAGTESR 
FKQWTSMMEGLPSVATQEA^IMHKNGAIVAPGK 

TRGGSPYNQFDIIPGDTLGGHTGPAGDSWLPAKS 

PPTNKIGSKSSNASWPPEFQPGVPWKGIQNIDPES 

DPYVTPGSVLGGTATSPIVDTDHQLLRDNTTGSN 

SSLNTSLPSPGAWPYSASDNSFTNVHSTSAKFPD 

YKSTWSPDPIGHNPTHLSNmWK>raSSRNTTPL 

PRPPPGLTNPKPSSPWSSTAPRSVRGWGTQDSRL 

ASASTWSDGGSVRPSYWLVLHNLTPQIDGSTLRT 

ICMQHGPLLTFHLNLTQGTALIRYSTXQEAAKAQ 

TALHMCVLGNTTILAEFATDDEVSRFLAQAQPPT 

PAATPSAPAAGWQSLETGQNQSDPVGPALNLFG 

GSTGLGQWSSSAGGSSGADLAGASLWGPPNYSS 

SLWGVPTVEDPHRMGSPAPLLPGDLLGGGSDSI 


3226 


A 


200 


1387 


VPWKRQDEQLSLQVETLYLDSPAVIHLLSPTFLP 

PSSLPPFLQIVDSSSSACTLDSFFPFLAPWDSPQDC 

GFKDHQPLTLQALTVELARWTLMLLLSTAMYG 

AHAPLLALCHVDGRVPFRPSSAVLLTELTKLLLC

AFSLLVGWQAWPQGPPPWRQAAPFALSALLYG 

ANNNLVIYLQRYMDPSTYQVLSNLKIGSTAVLY 

CLCLRHRLSVRQGLALLLLMAAGACYAAGGLQ 

VPGNTLPSPPPAAAASPMPLHITPLGLLLLILYCLI

SGLSSVYTELLMKRQRLPLALQNLFLYTFGVLLN 

LGLHAGGGSGPGLLEGFSGWAALVVLSQALNGL 

LMSAVMKHGSSITRLFVVSCSLVVNAVLSAVLL 

RLQLTAAFFLATLLIGLAMRLYYGSR 


3227 


A 


1


679 


RSTRARTRRPGLRAVPLPVGGFLGKMKWVWAL 

LLLAALGSGRAERDCRVSSFRVKENFDKARFSGT 

WYAMAKKDPEGLFLQDNrVAEFSVDETGQMSA 

TAKGRVRLLNNWDVCADMVGTFTDTEDPAKFK 

MKYWGVASFLQKGNDDHWIVDTDYDTYAVQY 

SCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKIV 

RQRQEELCLARQYRLIVHNGYCDGRSERNLL 


3228 


A 


430 


1104 


QQESPAAGAARMNCKEGTDSSCGCRGNDEKKM 

LKCVWGDGAVGKTCLLMSYANDAFPEEYVPT 

VFDHYAVTVTVGGKQHLLGLYDTAGQEDYNQL 

RPLSYPNTDVFLICFSWNPASYHNVQEEWVPEL 

KDCMPHVPYYLIGTQIDLRDDPKTLARLLYMKE 

KPLTYEHGVKLAKAIGAQCYLECSALTQKGLKA 

WDEAILTIFHPKKKKKRCSEGHSCCSn 


3229 


A 


25 


722


AISAGRSAKMQLKPMEINPEMLNKVLSRLGVAG 

QWRFVDVLGLEEESLGSVPAPACALLLLFPLTAQ 

HENFRKKQIEELKGQEVSPKVYFMKQTIGNSCGT 

IGLIHAVANNQDKLGFEDGSVLKQFLSETEKMSP 

EDRAKCFEKNEAIQAAHDAVAQEGQCRVDDKV

NFHFDLFNNVDGHLYELDGRMPFPVNHGASSEDT 

LLKDAAKVCREFTEREQGEVRFSAVALCKAA 


3230 


A 


282 


1479 


GDAATTACAPPDWFLGPRKLAAGPAGGGMLPR 

RLLAAWLAGTRGGGLLALLANQCRFVTGLRVR 

RAQQIAQLYGRLYSESSRRVLLGRLWRRLHGRP 

GHASALMAALAGVFVWDEERIQEEELQRSINEM 

KRLEEMSNMFQSSGVQHHPPEPKAQTEGNEDSE

GKEQRWEMVMDKKHFKLWRRPITGTHLYQYRV 

FGTYTDVTPRQFFNVQLDTEYRKKWDALVIKLE 

VIERJDVVSGSEVLHWVTHFPYPMYSRDYVYVRR 

YSVDQENNMMVLVSRAVEHPSVPESPEFVRVRS 

YESQMVIRPHKSFDENGFDYLLTYSDNPQTVFPR 

YCVSWMVSSGMPDFLEKLHMATLKAKNMEIKV 
KDYISAKPLEMSSEAKATSQSSERKNEGSCGPAR 
IEYA 


3231 


A 


2117 


590 


FVPEPPEAGASSPCAPGDPDMSFRKWRQSKFRH 

WGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKF 

LAVIVEASGGGAFLVLPLSKTGRIDKAYPTVCGH 

TGP\^DmWCPHNDEVL^SGSEDCTVMVWQIPE 

NGLTSPLTEPVVVLEGHTKRVGIIAWHPTARNVL 

LSAGCDNWLIWNVGTAEELYRLDSLHPDLIYN 

VSWNHNGSLFCSACKDKSVRIIDPRRGTLVAERE 

KAHEGARPMRAIFLADGKVFTTGFSRMSERQLA 

LWDPENLEEPMALQELDSSNGALLPFYDPDTSV 

VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPQR 

GMGSMPKRGLEVSKCEIARFYKLHERKCEPIVM 

TVPRKSDLFQDDLYPDTAGPEAALEAEEWVSGR 

DADPDLISLREAYVPSKQRDLKISRRNVLSDSRPA 

MAPGSSHLGAPASTTTAADATPSGSLARAGEAG 

KLEEVMQELRALRALVKEQGDRJCRLEEQLGRM 

ENGDA


3232 


A


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 
GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 
MREDATILPSPTSETVLTVAAFGVISFIVILVVWI 
ILVGVVSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL


3233 


A


3


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 
GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 
MREDATILPSPTSETVLTVAAFGVISFIVILVVVVI 
ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 


3234 


A 


1169 


4292 


AGDCGRLGVGGSEFPWEGSALGASPLPPICLQSR 

TWLLRAPAPAELGELEEVAAGRGDVWEPFLDSP 

GREESLQEASPRLADHGSSSGGGWEVKRSQRLR 

RGPSSPRRPYQDMEYERRGGRGDRTGRYGATDR 

SQDDGGENRSRDHDYRDMDYRSYPREYGSQEG 

KHDYDDSSEEQSAEDSYEASPGSETQRRRRRRH 

RHSPTGPPGFPRDGDYRDQDYRTEQGEEEEEEED 

EEEEEKASNIVMLRMLPQAATEDDIRGQLQSHG 

VQAREVRLMR>HCSSGQSRGFAFVEFSHLQDATR 

WMEANQHSLNILGQKVSMHYSDPKPKINEDWL

CNKCGVQNFKRREKCFKCGVPKSEAEQKLPLGT 

RLDQQTLPLGGRELSQGLLPLPQPYQAQGVLAS

QALSQGSEPSSENANDTHLRNLNPHSTMDSILGA 

LAPYAVLSSSNVRVIKDKQTQLNRGFAFIQLSTIE 

AAQLLQDLQALHPPLTIDGKTINVEFAKGSKRDM 

ASNEGSRISAASVASTAIAAAQWAISQASQGGEG 

TWATSEEPPVDYSYYQQDEGYGNSQGTESSLYA 

HGYLKGTKGPGITGTKGDPTGAGPEASLEPGADS 

VSMQAFSRPQPGAAPGIYQQSAEASSSQGTAANS 

QSYTIMSPAVLKSELQSPTHPSSALPPATSPTAQE 

SYSQYPWDVSTYQYDETSGYYYDPQTGLYYDP 

NSQYYYNAQSQQYLYWDGERRTYVPALEQSAD 

GHKETGAPSKEGKEKKEKHKTKTAQQIAKDME 

RWARSLNKQKENFKNSFQPISSLRDDERRESATA 

DAGYAILEKKGALAERQHTSMDLPICLASDDRPS 

PPRGLVAAYSGESDSEEEQERGGPEREEKLTDW 

QKLACLLCRRQFPSKEALIRHQQLSGLHKQNLEI 

HRRAHLSENELEALEKNDMEQMKYRDRAAERR

EKYGIPEPPEPKRRKYGGISTASVDFEQPTRDGLG 

SDNIGSRMLQAMGWKEGSGLGRKKQGIVTPIEA 

QTRVRGSGLGARGSSYGVTSTESYKETLHKTMV 

TRFNEAQ 


3235 


A 


3 


1217 


PSFLNTGLGPTALGVLGGAGAGLMSNPSPQVPEE 

EASTSVCRPKSSMASTSRRQRRERRFRRYLSAGR 

LVRAQALLQRHPGLDVDAGQPPPLHRACARHD 

APALCLLLRLGADPAHQDRHGDTALHAAARQG

PDAYTDFFLPLLSRCPSAMGIKNKDGETPGQELG 

WGPPWDSAEEEEEDDASKEREWRQKLQGELED 

EWQEVMGRFEGDASHETQEPESFSAWSDRLARE 

HAQKCQQQQREAEGSCRPPRAEGSSQSWRQQEE 

EQRLFRERARAKEEELRESRARRAQEALGDREP 

ICPTRAGPREEHPRGAGRGSLWRFGDVPWPCPGG 

GDPEAMAAALVARGPPLEEQGALRRYLRVQQV 

RWHPDRFLQRFRSQEETWELGRVMGAVTALSQA 

LNRHAEALK 


3236 


A 


3 


1416 


GPASGMAEPTSDFETPIGWHASPELTPTLGPLSDT 

APPRDRWMFWAMLPPPPPPLTSSLPAAGSKPSSE 

SQPPMEAQSLPGAPPPFDAQILPGAQPPFDAQSPL 

DSQPQPSGQPWNFHASTSWYWRQSSDRFPRHQK 

SLNPAVKNSYYPRKYDAKFTDFSLPPSRKQKKK 

KRKEPVFHFFCDTCDRGFKNQEKYDKHMSEHTK 

CPELDCSFTAHEKIVQFHWRNlvlHAPGMKKJKLD 

TPEEIARWREERRKNYPTLANIERKKKLKLEKEK

RGAVLTTTQYGKMKGMSRHSQMAKIRSPGKNH 

KWKNDNSRQRAVTGSGSHLCDLKLEGPPEANA 

DPLGVLINSDSESDKEEKPQHSVIPKEVTPALCSL 

MSSYGSLSGSESEPEETPIKTEADVLAENQVLDSS 

APKSPSQDVKATVROTSEAKSENRKKSFEKTNPK 

REKRLSQLSNVIRTKNTPSISLGNASSSGHST 


3237 


A 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFGRR 

RRRGRVVSRICKMSLKSERRGIHVDQSDLLCKKG 

CGYYGNPAWQGFCSKCWREEYHKARQKQIQED 

WELAERLQREEEEAFASSQSSQGAQSLTFSKFEE 

KKTNEKTRKVTTVKKFFSASSRVGSKXEIQEAICA 

PSPSINRQTSIETDRVSKEFIEFLKTFHKTGQEIYK 

QTKLFLEGMHYKRDLSIEEQSECAQDFYHNVAE 

RMQTRGKVPPERVEKIMDQIEKYIMTRLYKYVF 

CPETTDDEKKDLAIQKRTRALRWVTPQMLCVPV 

NEDIPEVSDMVVKAITDIIEMDSKRVPRDKLACIT 

KCSKHIFNAIKITKNEPASADDFLPTLIYIVLKGNP 

PRLQSNIQYITRFCNPSRLMTGEDGYYFTNLCCA 

VAFIEKLDAQSLNLSQEDFDRYMSGQTSPRKQEA 

ESWSPDACLGVKQMYKNLDLLSQLNERQERIMN 

EAKKLEKDLEDWTDGIAREVQDIVEKYPLEIKPP 

NQPLAAIDSENVENDKLPPPLQPQVYAG 


3238 


A 


1373 


449 


VLSVCPTGVFRPAPCRMAFMKKYLLPELGLFMA 
YYYYSANEEFRPEMLQGKKVIVTGASKGIGREM 
AYHLAKMGAHVVVTARSKETLQKWSHCLELG 
AASAHYIAGTMEDMTFAEQFVAQAGKLMGGLD 
MLILNHITNTSLNLFHDDIHHVRKSMEVNFLSYV 

AYSASKFALDGFFSSIRKEYSVSRVNVSITLCVLG 
LIDTETAMKAVSGIVHMQAAPBCEECALEIIKGGA 

T PriFPVWHQQT WTTT T TDXTD^T5 VTT "CfT VCTOW 

Livycn VII UooL, Wll JULJJuNrUJKJ^Lnr .L Yo 1 bYN 
MDRFINK 


3239 


A 


213 


499 


PPTA/tOT PT W A T "MT7tnrV7 VKTIVT T YI77/"MYT vvzr±r? A 

xzi\ l M y JL/JdJUn. V AIjJN r Ilr Y L Y IN ivLL W/^ r LKKK* E A 
HWYPDKPLKGSGFHT/GEMVDPVGELAAKRSGL 
TVED 


3240 


A 


1255 


1425 


HESYHVNPNLCNPVAPTSGAHSIG*KWPSWLGA 
VArioUINro li^VvjKvaOKllKvj^iil^K 


3241 


A 


161 


547 


PAGIGRSTAKTPGTPGSLEMENLKSGVYPLKEAS 
Utru AUruNJLJL V i or i xiKUFL 1 rssL) VAlbr oLilE W 
QCLDTAQQDLYRKVMLENYRNLVFLAGIAVSKP 
DLITCLEQGKEPWNMKRHAMVDQPPGR 


3242 


A 


50 


241 


PLPARGKSTLPATFCSPSAPELASMSVVPPNRSQT 
U W rKU V 1 v^r CjJN R Y 1 Tl^LTLERTINL 


3243 


A 


380 


702 


FVAYLIG^PFFSQVCLFASSEMFFTISRKNMSQKLS 
LLLLVFGLIWGLMLLHYTFQQPRHQSSVKLREQI 
LDLSKRYVKALAEENKNTVDVENGASMAGYGK 
ITVEYF 


3244 


A 


37 


1391 


VLMDGRMMRSMRLREEESPGPSHTASCLCGSAP 

CILCSCCPASRNSTVSRLIFTFFLFLGVLVSIIMLSP 

GVESQLYKLPWVCEEGAGIPTVLQGHIDCGSLLG 

YRAVYRMCFATAAFFFFFTLLMLCVSSSRDPRA 

AIQNGFWFFKFLILVGLTVGAFYIPDGSFTNIWFY 

FGVVGSFLFILIQLVLLIDFAHSWNQRWLGKAEE 

CDSRAWYAGLFFFTLLFYLLSIAAVALMFMYYT 

EPSGCHEGKVFISLNLTFCVCVSIAAVLPKVQDA 

QPNSGLLQASVITLYTMFVTWSALSSIPEQKCNP 

rlLr 1 ^LuNb 1 V V AGrPEG YETQ WWDAPSrv GLIIF 

LLCTLFISLRSSDHRQVNSLMQTEECPPMLDATQ 

QQQQVAACEGRAFDNEQDGVTYSYSFFHFCLVL 

ASLHVMMTLTNWYKPGETRKMISTWTAVWVKI 

r* A QXk7 A m T T VT 


3245 


A 


52 


426 


SSLGNEDDEILSLAKDITGMFVASHRKMRAHQV 
LTFLLLFVITSVASENASTSRGCGLDLLPQYVSLC 
DLDAIWGIVVEAAAGAGALITLLLMLILLVRLPF 
FKEKEiaCSPVGLHFLFLLGTLGP 


3246 


A 


3 




HEVCGSGCCCHCCAGGPVARQKALPRLRGVMS 

OUT XT\rT OCYI7T T/"K>T\7CTT A X JT/^XTTT AOTT1 T*\T T' 1 T'T ~\rT*T7- 

Kr Lin V LKo WLVMV S1JAMGNTLQSFRDHTFL YEK 

LYTGI<^NLWGLQARTFGIWTLLSSVIRCLCAIDI 

HNKTLYHITLWTFLLALGHFLSELFVYGTAAPTI 

GVLAPLMVASFSILGMLVGLRYLEVEPVSRQKK 

RN 


3247 


A 


1 


932 


ERLCFPCMQSKIYSYMSPNKCSGMRFPLQEENSV 
THHEVKCQGKPLAGIYRKREEKRNAGNAVRSA 
MKSEEQKIKDARKGPLVPFPNQKSEAAEPPKTPP 
SSCDSTNAAIAKO ALKKPnCGKOAPRKK AOGKT 
QQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELI 
ESGKEEGMKIDLIDGKGRGVIATKQFSRGDFVVE 
YHGDLIEITDAKKPJEALYAQDPSTGCYlvlYYFQY 
LSKTYCVDATRETNRLGl^D^SKCGNCQTKLH 
DIDGVPHLILIASRDIAAGEELLYDYGDRSICASIE 
AHPWLKH 


3248 


A 


3 


870 


PGSTISCSELKGTQCRATAGSRGRRPPMTCWLRG 

VTATFGRPAEWPGYLSHLCGRSAAMDLGPMRK 

SYRGDREAFEETHLTSLDPVKQFAAWFEEAVQC 

PDIGEANAMCLATCTRDGKPSARMLLLKGFGKD 

GFRFFTNFESRKGKELDSNPFASLVFYWEPLNRQ 

VRVEGPVKKLPEEEAECYFHSRPICSSQIGAWSH 

QSSVIPDREYLRKKNEELEQLYQDQEVPKPKSW 

GGYVLYPQVMEFWQGQTNRLHDRIVFRRGLPTG 

DSPLGPMTHRGEEDWLYERLAP 


3249 


A 


43 


1210 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRG 

EEGHDPKEPEQLRKLFIGGLSFETTDDSLREHFEK 

WGTLTDCVVMRDPQTKRSRGFGFVTYSCVEEV 

DAAMCARPHKVDGRWEPKRAVSREDSVKPGA 

HLTVKKIFVGGIKEDTEEYNLRDYFEKYGKIETTE 

VMEDRQSGKKRGFAFVTFDDHDTVDKIVVQKY 

HTINGHNCEVKKALSKQEMQSAGSQRGRGGGS 

GNFMGRGGNFGGGGGNFGRGGNFGGRGGYGG 

GGGGSRGSYGGGDGGYNGFGGDGGNYGGGPG 

YSSRGGYGGGGPGYGNQGGGYGGGGGYDGYN 

EGGNFGGGNYGGGGNYNDFGNYSGQQQSNYGP 

MKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


3250 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIVVSNCVINL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3251 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIVVSNCVINL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3252 


A 


1 


574 


PLGSNTAPALRVMVQAWYMDDAPGDPRQPHRP 
DPGRPVGLEQLRRLGVLYWKLDADKYENDPELE 
KIRRERNYSWNIDnTICKDKLPNYEEKEKMFYEE 










HLHLDDblK I JUJLKj bKj Y r D V RDl^DQ WlKJxMbK. 
GDMVTLPAGIYHRFTVDEKNYTKAMRLFVGEPV 
WTAYNRPADHFEARGQYVKFLAQTA 


3253 


A 


2 


984 


ARAAAHCGICRLVRWWRKRRSVMGIQTSPVLLA 
SLGVGLVTLLGLAVGSYLVRRSRRPQVTLLDPNE 






WO 01/57190 



PCT/US01/04098 



KYLLRLLDKTTVSHNTKRFRPALPTAHHTLGLPV 

GKHIYLSTRIDGSLVIRPYTPVTSDEDQGYVDLVI 

KVYLKGVHPKFPEGGKMSQYLDSLKVGDWEF 

RGPSGLLTYTGKGHTOIQPNKKSPPEPRVAKKLG 

MIAGGTGITPMLQLIRAILKVPEDPTQCFLLFANQ 

TEKDnLREDLEELQARYPNRFKLWFTLDHPPKD 

WAYSKGFVTADMIREHLPAPGDDVLVLLCGPPP 

MVQLACHPNLDKLGYSQKMRFTY 


3254 


A 


1 


968 


LQSAGEGVTHVLILLESPARPVAAVTQVQRRRY 

HRLSDMSMLAERRRKQKWAVDPQNTAWSNDD 

SKFGQRMLEKMGWSKGKGLGAQEQGATDHIKV 

QVKNNHLGLGATINNEDNWIAHQDDFNQLLAEL 

NTCHGQEITDSSDKKEKKSFSLEEKSKISKNRVH 

YMKFTKGKDLSSRSKTDLDCIFGKRQSKKTPEG 

DASPSTPEENETTTTSAFTIQEYFAKRMAALKNK 

PQVPVPGSDISETQVERKRGKKRNKEATGKDVE 

SYLQPKAKRHTEGKPERAEAQERVAKKKSAPAE 

EQLRGPCWDQSSICASAQDAGDHVQPA 


3255 


A 


173 


439 


GSAAMKVKIKCWNGVATWLWVANDENCGICR 

MAFNGCCPDCKVPGDDCPLVWGQCSHCFHMHC 

ILKWLHAQQVQQHCPMCRQEWKFKE 


3256 


A 


2 


377 


TAARRRQKGTAARRRQKGTLEEVVLPPRSCRVF 
WIHSGTTMSKVSFKITLTSDPRLPYKVLSVPESTP 
FTAVLKFAAEEFKVPAATSAIITNDGIGINPAQTA 
GNVFLKHGSELRIIPRDRVGSC 


3257 


A 


3 


1454 


GCSAAAAGAGSGPWAAQEKQFPPALLSFFIYNPR 

FGPREGQEENKILFYHPNEVEKNEKIRNVGLCEAI 

VQFTRTFSPSKPAKSLHTQKNRQFfTSTEPEENFWM 

VMVVRNPIIEKQSKDGKPVIEYQEEELLDKVYSS 

VLRQCYSMYKLFNGTFLKAMEDGGVKLLKERL 

EKFFHRYLQTLHLQSCDLLDIFGGISFFPLDKMTY 

LKIQSFINRMEESLNIVKYTAFLYNDQLIWSGLEQ 

DDMRILYKYLTTSLFPRHIEPELAGRDSPIRAEMP 

GNLQHYGRFLTGPLNLNDPDAKCRFPKIFVNTD 

DTYEELHLIVYKAMSAAVCFMIDASVHPTLDFC 

RRLDSIVGPQLTVLASDICEQFNINICRMSGSEKEP 

QFKFIYFNHlVENLAEKSTVHMRKTPSVSLTSVHiPD 

LMKILGDINSDFTRVDEDEEIIVKAMSDYWVVG 

KKSDRRELYVILNQKNANLIEVNEEVKKLCATQF 

NNIFFLD 


3258 


A 


113 


1558 


APRGCSMPHRKKKPFIEKKKAVSFHLVHRSQRD 

PLAADESAPQRVLLPTQKIDNEERRAEQRKYGVF 

FDDDYDYLQHLKEPSGPSELIPSSTFSAHNRREEK 

EETLVBPSTGIKLPSSVFASEFEEDVGLLNKAAPV 

SGPRLDFDPDIVAALDDDFDFDDPDNLLEDDFIL 

QANKATGEEEGMDIQKSENEDDSEWEDVDDEK 

GDSNDDYDSAGLLSDEDCMSVPGKTHRAIADHL 

FWSEETKSRFTEYSMTSSVMRRNEQLTLHDERFE 

KFYEQYDDDEIGALDNAELEGSIQVDSNRLQEVL 

NDYYKEKAENCVKLNTLEPLEDQDLPMNELDES 

EEEEMITWLEEAKEKWDCESICSTYSNLYNHPQ 

LIKYQPKPKQIRISSKTGIPLNVLPKKGLTAKQTE 

RIQMINGSDLPICVSTQPRSICNESKEDKRARKQAI 

KEERKJERRVEKKANKLAFKLEKRRQEKELLNLK 

KNVEGLKL 


3259 


A 


3 


964 


QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFLSM 

YLVTVLGNLLIILATISDSHLHTPMYFFLSNLSFA 

DICVTSTTPKMLMNIQTQNKVITYL^CLMQMYF 

FILFAGFENFLLSVMAYDRFVAICHPLHYMVIMN 

PHLCGLLVLASWTMSALYSLLQILMVVRLSFCT 

ALEIPHFFCELNQVIQLACSDSFLNHMVIYFTVAL 

LGGGPLTGILYSYSKIISSIHAISSAQGKYKAFSTC 

ASHLSWSLFYGAILGVYLSSAATRNSHSSATAS 

VMYTWTPMLNPFIYSLRNKDIKRALGIHLLWGT 

MKGQFFKKCP 


3260 


A 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTE 

HGTPKPFRKFDSVAFGESQSEDEQFENDLETDPP 

NWQQLVSREVLLGLKPCEIKRQEVINELFYTERA 

HVRTLKVLDQVFYQRVSREGILSPSELRKIFSNLE 

DILQLHIGLNEQMKAVRKRNETSVIDQIGEDLLT 

WFSGPGEEKLKHAAATFCSNQPFALEMIKSRQK 

KDSRFQTFVQDAESNPLCRRLQLKDIIPTQMQRL 

TKYPLLLDNIATYTEWPTEREKVKKAADHCRQIL 

NYVNQAVKEAENKQRLEDYQRilLDTSSLKLSEY 

PNVEELRNLDLTKRKMIHEGPLVWKVNRDKTID 

LYTLLLEDCLVLLQKQDDRLVLRCHSKILASTAD 

SICHTFSPVIKLSTVLVRQVATDNKALFVISMSDN 

GAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTG 

LQSPDRDLGLESTLISSKPQSHSLSTSGKSEVRDL 

FVAERQFAKEQHTDGTLKEVGEDYQIAIPDSHLP 

VSEERWALDALRNLGLLKQLLVQQLGLTEKSVQ 

EDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSG 

EGHMPFRTGTGDIATCYSPRTSTESFAPRDSVGL 

APQDSQASNILVMDHMIMTPEMPTMEPEGGLDD 

SGEHFFDAREAHSDENPSEGDGAVNKEEKDVNL 

RISGNYLILDGYDPVQESSTDEEVASSLTLQPMT 

GIPAVESTHQQQHSPQNTHSDGAISPFTPEFLVQQ 

RWGAMEYSCFEIQSPSSCADSQSQEMEYIHKIEA 

DLEHLKKVEESYTILCQRLAGSALTDKHSDKS 


3261 


A 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEGAA 

GQQPTAPDKSKETNKTDNTEAPVTIOELLPSYST 

ATLIDEPTEVDDPWNLPTLQDSGIKWSERDTKGK 

ILCFFQGIGRLILLLGFLYFFVCSLDILSSAFQLVG 

GKMAGQFFSNSSIMSNPLLGLVIGVLVTVLVQSS 

STSTSIVVSMVSSSLLTVRAAIPIIMGANIGTSITNT 

IVALMQVGDRSEFRRAFAGATVHDFFNWLSVLV 

LLPVEVATHYLEITTQLIVESFHFKNGEDAPDLLK 

VITKPFTKLIVQLDKKVISQIAMNDEKAKNICSLV 

m^CKTFTNKTQI>TVTVPSTANCTSPSLCWTDGI 

QNWTMKNVTYKENLAXCQrnFWFHLPDLAVGT 

ILLILSLLVLCGCLIMIVKILGSVLKGQVATVIKKT 

INTDFPFPFAWLTGYLADLVGAGMTFIVQSSSVFT 

SALTPLIGIGVITIERAYPLTLGSNIGTTTTAILAAL 

ASPGNALRSSLQIALCHFFFNISGILLWYPIPFTRL 

PIRMAKGLGMSAKYRWFAVFYLIIFFFLIPLTVFG 

LSLAGWRVLVGVGVPWFIIILVLCLRLLQSRCPR 

VLPKKLQNWNFLPLWMRSLKPWDAVVSKFTGC 

FQMRCCCCCRVCCRACCLLCGCPKCCRCSKCCE 

DLEEAQEGQDVPVKAPETFDNITISREAQGEVPA 
SDSKTECTAL 


3262 


A 


30 


1377 


SQQGSQPHRQGPPSLLTAPHSLDLPALPPGPRGS 

QGKLRRVLVPMSVKPSWGPGPSEGVTAVPTSDL 

GEITOWTELLDLFNHTLSECHVELSQSTKRVVLF 

ALYLAMFVVGLVENLLVICVNWRGSGRAGLMN 

LYILNMAIAJDLGWLSLPVWMLEVTLDYTWLWG 

SFSCRFTrnTYFVNMYSSrr^ 

ASPSWQRYQHRVRRAMCAGIWVLSAIIPLPEVV 

HIQLVEGPEPMCLFMAPFETYSTWALAVALSTTI 

LGFLLPFPLITVFNVLTACRLRQPGQPKSRRHCLL 

LCAYVAVFVMCWLPYHVTLLLLTLHGTHISLHC 

HLVHLLYFFYDVIDCFSMLHCVINPILYNFLSPHF 

RGRLLNAVVHYLPKDQTKAGTCASSSSCSTQHSI 

IITKGDSQPAAAAPHPEPSLSFQAHHLLPNTSPISP 

TQPLTPS 


3263 


A 


1 


919 


QARSPSVAAMASPQLCRALVSAQWVAEALRAP 

RAGQPLQLLDASWYLPKLGRDARREFEERHIPG 

AAFFDIDQCSDRTSPYDHMLPGAEHFAEYAGRL 

GVGAATHVVrYDASDQGLYSAPRVWWMFRAFG 

HHAVSLLDGGLRHWLRQNLPLSSGKSQPAPAEF 

RAQLDPAFIKTYEDIKENLESRRFQVVDSRATGR 

FRGTEPEPRDGIEPGHIPGTVNIPFTDFLSQEGLEK 

SPEEIRHLFQEKKVDLSKPLVATCGSGVTACHVA 

LGAYLCGKPDWIYDGSWVEWYMRARPEDVISE 

GRGKTH 


3264 


A 


1 


1398 


ARRSTPRTAPRASATRSAAGTMREIVHIQAGQCG 

NQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERI 

NVYYNEAAGNKYVPRAILVDLEPGTMDSVRSGP 

FGQIFRPDNFVFGQSGAGNNWAKGHYTEGAELV 

DSVLDVVRKESESCDCLQGFQLTHSLGGGTGSG 

MGTLLISKIREEYPDRIMNTFSVMPSPKVSDTVVE 

PYNATLSVHQLVENTDETYSIDNEALYDICFRTL 

KLTTPTYGDLNHLVSATMSGVTTCLRFPGQLNA 

DLRKLAVNMVPFPRLHFFMPGFAPLTSRGSQQY 

RALTWELTQQMFDSKNMMAACDPRHGRYLTV 

AAIFRGRMSMKEVDEQMLNVQNKNSSYFVEWIP 

NNVKTAVCDIPPRGLKMSATFIGNSTAIQELFKRI 

SEQFTAMFRRKAFLHWYTGEGNIDEMEFTEAES 

NMNDLVSEYQQYQDATADEQGEFEEEEGEDEA 


3265 


A 


265 


862 


WWEDARVLGPFHPEEEGHWVMTPSEGARAGTG 

RELEMLDSLLALGGLVLLRDSVEWEGRSLLKAL 

VKKSALCGEQVHILGCEVSEEEFREGFDSDINNR 

LVYHDFFRDPLNWSKTEEAFPGGPLGALRAMCK 

RTDPVPVTIALDSLSWLLLRLPCTTLCQVLHAVS 

HQDSCPGETPPSLFPLIHLPLPRSVPLFLSTLE 


3266 


A 


2 


884 


AAGAGADGREPASERASRAEPPAVAMGQNDLM 

GTAEDFADQFLRVTKQYLPHVARLCLISTFLEDG 

IRMWFQWSEQRDYIDTTWNCGYLLASSFVFLNL 

LGQLTGCVLVLSRNFVQYACFGLFGIIALQTIAYS 

ILWDLKFLMRNLALGGGLLLLLAESRSEGKSMF 

AGVPTMRESSPKQYMQLGGRVLLVLMFMTLLH 

PTDASFFSIVQNIVGTALMILVAIGFKTKLAALTLV 

VWLFAI>rVYFNAFWTIPVYKPMHDFLKYDFFQT 

MSVIGGLLLVVALGPGGVSMDEKKKEW 


3267 


A 


802 


1011 


ASTFCSAWKRRSTAALWWSGSRASRSHPRELGP 
LCFVFGTAALSIRSMDVLSLFLEHGKLVFASGLSP 
RA 


3268 


A 


490 


679 


EDAWITNPSLSNARSTPSKPLCYTVLKEGQWGV 
KTTKASNTREKLRPESERRMVKSFGDEVT 


3269 


A 


2 


796 


GSTHASGARPSLKRARSQRGRPLPSRALPSAHKD 

MTTNAGPLHPYWPQHLRLDNFVPNDRPTWHILA 

GLFSVTGVLVVTTWLLSGRAAVVPLGTWRRLSL 

CWFAVCGFIHLVIEGWFVLYYEDLLGDQAFLSQ 

LWKEYAKGDSRYILGDNFTVCMETITACLWGPL 

SLWVVIAFLRQHPLRFILQLWSVGQIYGDVLYF 

LTEHRDGFQHGELGHPLYFWF^TVFMNALWLV 

LPGVLVLDAVKHLTHAQSTLDAKATKAKSKKN 


3270 


A 


17 


229 


GDTGPQILMSYLDSVASKLLQMVKKLSQSFCSNF 
KYLTKYSRKQVSDEIKKSRRTVESNPIFFKKNKKI 
Q 


3271 


A 


419 


553 


IQSGLSLCFADLSETPEGRAGVPGCPHSCDGVAS 
GRPCSPSSAG 


3272 


A 


1211 


1450 


FQFIQIELLNILQSLIRNQTQSPYNTTAYPAIDSVIT 
ILPFSFSCFFnTKCFGLSIFPSVIFFLHVYFILTLVVF 
YCC 


3273 


A 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFPELP 

LPHWGQESAKRRSARRFLIMSELTKELMELVW 

GTKSSPGLSDTIFCRWTQGFVFSESEGSALEQFEG 

GPCAVIAPVQAFLLKKLLFSSEKSSWRDCSQEEQ 

KELLCHTLCDILESACCDHSGSYCLVSWLRGKTT 

EETASISGSPAESSCQVEHSSALAVEELGFERFHA 

LIQKRSFRSLPELKDAVLDQYSMWGNKFGVLLF 

LYSVLLTKGIENDCNEIEDASEPLIDPVYGHGSQS 

LINLLLTGHAVSNVWDGDRECSGMKLLGIHEQA 

AVGFLTLMEALRYCKVGSYLKISKPYLDCLASE 

THLTVFFAKDMALVAPEAPSEQARRVFQTYDPE 

DNGFIPDSLLEDVMKALDLVSDPEYINLMKNKL 

DPEGLGEfLLGPFLQEFFPDQG SSGPESFTVYHYN 

GLKQSNYNEKVMYVEGTAVVMGFEDPMLQTD 

DTPIKRCLQTKWPYEELLWTTDRSPSLN 


3274 


A 


186 


1358 


RVVHRFFKSSAFWPAEVKQPRGGPKTGSRKEGA 

GSRAPQPVVRSFCGSVGAEGRMEKLRLLGLRYQ 

EYVTRHPAATAQLETAVRGFSYLLAGRFADSHE 

LSELVYSASNLLXHLLNDGILRKELRKKLPVSLSQ 

QKLLTWLSVLECVEVFMEMGAAKVWGEVGRW 

LVIALIQLAKAVLRMLLLLWFKAGLQTSPPr/PL 

DRETQAQPPDGDHSPGNHEQSYVGKRSNRWRT 

LQNTPSLHSRHWGAPQQREGRQQQHHEELSATP 

TPLGLQETIAEFLYIARPLLHLLSLGLWGQRSWK 

PWLLAGVVDVTSLSLLSDRKGLTRRERRELRRR 

TILLLYYLLRSPFYDRFSEARDLFLLQLLADHVPG 

VGLVTRPLMDYLPTWQKIYFYSWG 


3275 


A 


575 


759 


SVYSASSCKCCNYRKTEQIPDCEQPPASSMPERPS 
HESQPTPQMMPLSAPSRAEELGQRPG 


3276 


A 


7 


258 


KAAGHRLLLAAGHPSMPSSDCLLWEGSLELRPL 
QHI S SLL VL V SI rCLr AFPRVP1AF ESKSCL1 Y HCH 
CAFTVRHYMCSSHTG 


3277 


A 


9 


2221 


KLGVEPEEEGGGDDEEDAEAWAMELADVGAAA 

SSQGVHDQVLPTPNASSRVIVHVDLDCFYAQVE 

MISNPELKDKPLGVQQKYLVVTCNYEARKLGVK 
KLMNVRDAKEKCPQLVLVNGEDLTRYREMSYK 

VTELLEEFSPWERLGFDENFVDLTEMVEKRLQQ 

LQSDELSAVTVSGHVYNNQSINLLDVLHIRLLVG 

SQIAAEMREAMYNQLGLTGCAGVASNKLLAKL 

VSGVFKPNQQTVLLPESCQHLIHSLNHEKEIPGIG 

YKTAKCLEALGINSVRDLQTFSPKILEKELGISVA 

QRIQKLSFGEDNSPVILSGPPQSFSEEDSFKKCSSE 

VEAKNKIEELLASLLNRLCQDERKPHTVRLIIRRY 

SSEKHYGRESRQCPIPSHVIQKLGTGNYDVMTPM 

VDILMKLFIO^VWKMPFHLTLLSVCFCNLKAL 

NTAKKGLIDYYLMPSLSTTSRSGKHSFKMKDTH 

MEDFPKDKETNRDFLPSGRIESTRTRESPLDTTNF 

SKEKDINEFPLCSLPEGVDQEVFKQLPVDIQEEIL 

SGKSREKFQGKGSVSCPLHASRGVLSFFSKKQM 

v^DIPINPRDHLSSSKQVSSVSPCEPGTSGFNSSSSS 

YMSSQKDYSYYLDNRLKDERJSQGPKEPQGFHF 

TNSNPAVSAFHSFPNLQSEQLFSRNHTTDSHKQT 

VATDSHEGLTENREPDSVDEKJTFPSDIDPQVFYE 

LPEAVQKELLAEWKRTGSDFHIGHK 


3278 


A 


1 


876 


GLRLHVDLVEKPRTGIMAAETRNVAGAEAPPPQ 

KRYYRQRAHSNPMADHTLRYPVICPEEMDWSEL 

YPEFFAPLTQNQSHDDPKDKKEKRAQAQVEFAD 

IGCGYGGLLVELSPLFPDTLELGLEIRVKVSDYVQ 

UKJRALRAAPAGGFQNIACLRSNAMKHLPNFFY 

KGQLTKMFFLFPDPHFKRTKHKWRIISPTLLAEY 

AYVLRVGGLVYTITDVLELHDWMCTHFEEHPLF 

ERVPLEDLSEDPVVGHLGTSTEEGKKVLRNGGK 

NFPAIFRRIQDPVLQAVTSQTSLPGH 


3279 


A 


82 


2929 


TRTKRRLGREKAMASPPRGWGCGELLLPFMLLG 

TLCEPGSGQIRYSMPEELDKGSFVGNIAKDLGLE 

PQELAERGVRIVSRGRTQLFALNPRSGSLVTAGRI 

DREELCAQSPLCVVNFNILVENKMKIYGVEVEII 

DINDNFPRFRDEELKVKVNENAAAGTRLVLPFA 

RDADVGYNSLRSYQLSSNLHFSLDVVSGTDGQK 

YPELVLEQPLDREKETVHDLLLTALDGGDPVLSG 

TTHIRVTVLDANDNAPLFTPSEYSVSVPENIPVGT 

RLLMLTATDPDEGINGKLTYSFRNEEEKISETFQL 

DSNLGEISTLQSLDYEESRFYLMEWAQDGGAL 

VASAKWVTVQDVNDNAPEVILTSLTSSISEDCL 

PGTVIALFSVHDGDSGENGEIACSIPRNLPFKLEK 

SVDNYYHLLTTRDLDREETSDYMTLTVMDHGT 

PPLSTESHIPLKVADVMDNPPNFPQASYSTSVTEN 

NPRGVSIFSVTAHDPDSGDNARVTYSLAEDTFQG 

APLSSYVSINSDTGVLYALRSFDYEQLRDLQLWV 

TASDSGNPPLSSNVSLSLFVLDQNDNTPEILYPAL 

PTDGSTGVELAPRSAEPGYLVTKVVAVDKDSGQ 

NAWLSYRLLICASEPGLFAVGLHTGEVRTARALL 

DRDALKQSLVVAVEDHGQPPLSATFTVTVAVAD 

RIPDILADLGSIKTPIDPEDLDLTLYLWAVAAVS 

CVFLAFVIVLLVLRLRRWHKSRLLQAEGSRLAG 

VPASHFVGVDGVRAFLOTYSHEVST TAD^RK^H 

LIFPQPm^ADTLLSEESCEKSEPLLMSDKVDANK . 
EERRVQQAPPNTDWRFSQAQRPGTSGSQNGDDT 
GTWPNNQFDTEMLQAMILASASEAADGSSTLGG 
GAGTMGLSARYGPQFTLQHVLQGELGSDYRQN 
VYIPGSNATLTNAAGKRDGKAPAGGNGNKKKS 
GKKEKK 


3280 


A 


149 


1288 


GTSQMSSHKGSWAQGNGAPASNREADTAELAE 

LGPLLEEKGKRVIANPPKAEEEQTCPVPQEEEEE 

VRVLTLPLQAHHAMEKMEEFVYKVWEGRWRVI 

PYDVLPDWLKDNDYLLHGHRPPMPSFRACFKSIF 

RIHTETGNIWTHLLGFVLFLFLGILTMLRPNMYF 

MAPLQEKVVFGMFFLGAVLCLSFSWLFHTVYCH 

SEKVSRTFSKLDYSGIALLIMGSFVPWLYYSFYCS 

PQPRLIYLSIVCVLGISAIIVAQWDRFATPKHRQT 

RAGWLGLGLSGWPTMHFTIAEGFVKATTVGQ 

MGWFFLMAVMYITGAGLYAARIPERFFPGKFDI 

WFQSHQIFHVLWAAAFVHFYGVSNLQEFRYGL 

EGGCTDDTLL 


3281 


A 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSMNQEKLA 

KLQAQVRIGGKGTARRKKKVVHRTATADDKKL 

QSSLKKLAVNNIAGIEEVNMIKDDGWIHFNNPK 

VQASLSANTFAITGHAEAKPITEMLPGILSQLGAD 

SLTSLRKLAEQFPRQVLDSKAPKPEDIDEEDDDV 

PDLVENFDEASKNEAN 


3282 


A 


155 


1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPA 

LAPGAAAFAGLGGAPRFPPRGSAAGRTMLLKEY 

RICMPLTVDEYKIGQLYMISKHSHEQSDRGEGVE 

WQNEPFEDPHHGNGQFTEKRVYLNSKLPSWAR 

AVVPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIH 

IETKYEDNKGSNDTIFDNEAKDVEREVCFIDIACD 

EIPERYYKESEDPKHFKSEKTGRGQLREGWRDSH 

QPIMCSYKLVTVKFEVWGLQTRVEQFVHKVVR 

DILLIGHRQAFAWVDEWYDMTMDDVREYEKN 

MHEQTNIKVOSIQHSSPVDDTESHAQTST 


3283 


A 


159 


547 


IKSKJLNQQVEVQESEWRLTEAKGPTMGKESGW 
DSGRAAVAAWGGWAVGTVLVALSAMGFTSV 
GIAASSIAAKMMSTAAIANGGGVAAGSLVAILQS 
VGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 


3284 


A 


227 


637 


TSNSLLRPDRMSVMDLANTCSSFQSDLDFCSDCG 

SVLPLPGAQDTVTCIRCGFNINVRDFEGKVVKTS 

WFHQLGTAMPMSVEEGPECQGPVVDRRCPRCG 

HEGMAYHTRQMRSADEGQTVFYTCTNCKFQEK 

EDS 


3285 


A 


123 


1535 


HRLSYDEAFAMANDPLEGFHEVNLASPTSPDLL 

GVYESGTQEQTTSPSVIYRPHPSALSSVPIQANAL 

DVSELPTQPVYSSPRRLNCAEISSISFHVTDPAPCS 

TSGVTAGLTKLTTRKDNWAEREFLQGATITEAC 

DGSDDIFGLSTDSLSRLRSPSVLEVREKGYERLKE 

ELAKAQRELKLKDEECERLSKVRDQLGQELEEL 

TASLFEEAHKMVREANIKQATAEKQLKEAQGKI 

DVLQAEVAALKTLVLSSSPTSPTQEPLPGGKTPF 

KKGHTRNKSTSSAMSGSHQDLSVIQPIVKDCBCEA 

DLSLYNEFRLWKDEPTMDRTCPFLDKIYQEDFP 

CLTFSKSELASAVLEAVENNTLSIEPVGLQPIRFV 

KAaA VbCUUrKKCALl GQSKbCKHRIKLGDSoN 

YYYISPFCRYRITSVCNFFTYIRYIQQGLVKQQDV 

DQMFWEVMQLRKEMSLAKLGYFKEEL 


3286 


A 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHGYP 
GITEELLRSQLYPEVPPEEFRPFLAKMRGELKSIAS 
ADMDFNQLEAFLTAQTKKQGGITSDQAAVISKF 
WKSHKTKIRESLMNQSRWNSGLRGLSWRVDGK 
SQSRHSAQIHTPVAIIELELGKYGQESEFLCLEFD 
EVKVNQILKTLSEVEESISTLISQPN 


3287 


A 




390 


LGAMAKHHPDLIFCRKQAGVAIGRLCEKCDGKC 
VICDSYVRPCTLVRICDECNYGSYQGRCVICGGP 
GVSDAYYCKECTIQEKDRDGCPKIVNLGSSKTDL 
FYERKKYGFKKR 


3288 


A 

A 


3 


428 


RTTFFRFRPCESLCGDMKLLIHNLLSSHVRGVGS 

RGFPLRLQATEVRICPVEFNPNFVARMIPKVEWS 

AFLEAADNLRLIQVPKGPVEGYEENEEFLRTMH 

HLLLEVEVffiGTLQCPESGRMFPISRGEPNMLLSE 

EETES 


3289 


A 


1 


1743 


AGCCRDTRFPTPRGPGSLCHNFCRSAACTVTRTI 

HGSPREDTGTPRSREMMFQDSVAFEDVAVSFTQ 

EEWALLDPSQKNLYRDVMQETFKNLTSVGKTW 

KVQNIEDEYKNPRRNLSLMREKLCESKESHHCG 

ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGH 

SSLNTHIRADTGHKSSEYQEYGENPYRNKECKK 

AFSYLDSFQSHDKACTKEKPYDGKECTETFISHS 

CIQRHRVMHSGDGPYKCKFCGKAFYFLNLCLIH 

ERIHTGVKPYKCKQCGKAFTRSTILPVHERTHTG 

VNADECKECGNAFSFPSEIRRHKRSHTGEKPYEC 

KQCGKVFISFSSIQYHKMTHTGEKPYECKQCGK 

AFRCGSHLQKHGRTHTGEKPYECRQCGKAFRCT 

SDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQI 

HERTHSGEKPHECKECGKVFKYFSSLRIHERTHT 

GEKPHECKQCGICAFRYFSSLHIHERTHTGDKPYE 

CKVCGKAFTCSSSIRYHERTHTGEKPYECKHCGK 

AFISNYIRYHERTHTGEKPYQCKQCGKAFIRASS 

CREHERTHTTNR 


3290 


A 


2 


1350 


GRPRSSSDNRNFLRERAGLSSAAVQTRIGNSAAS 

RRSPAARPPVPAPPALPRGRPGTEGSTSLSAPAVL 

WAVAVVWVVSAVAWAMANYIHVPPGSPEVP 

KLNVTVQDQEEHRCREGALSLLQHLRPHWDPQE 

VTLQLFTDGITNKLIGCYVGNTMEDVVLVRIYGN 

KTELLVDRDEEVKSFRVLQAHGCAPQLYCTFNN 

GLCYEFIQGEALDPKHVCNPAIFRLIARQLAKIHA 

IHAHNGWIPKSNLWLKMGKYFSLIPTGFADEDIN 

KRFLSDffSSQILQEEMTWMKEILSNLGSPVVLCH 

MDLLCK1TOYNEKQGDVQFIDYEYSGYNYLAYDI 

GNHFNEFAGVSDVDYSLYPDRELQSQWLRAYLE 

AYKEFKGFGTEVTEKEVEILFIQVNQFALASHFF 

WGLWALIQAKYSTiEFDFLGYAIVRFNQYFKMK 

PEVTALKVPE 


3291 


A 


102 


839 


PEAQTSAVLAREKGHLPTMRHEAPMQMASAQD 

ARYGQKDSSDQNFDYMFKLLIIGNSSVGKTSFLF 

RYADDSFTSAFVSWGIDFKVKTVFKNEKRIKLQI 

WDTAGQERYRTITTAYYRGAMGFILMYDITNEE 

SFNAVQDWSTQDCTYSWDNAQVILVGNKCDME 

jluijv v lo l iiivvjv^rijjVjiiv^jjOr 

TFERLVDnCDKMSESLETDPAITAAKQNTRLKET 
PPPPQPNCAC 


3292 


A 


2 


4136 


DRPPWNSRVDDFVTNLIHLSSKGHISPAKDTSLQ 
QRTPAEMSPVLHFYVRPSGHEGAASGHTRRKLQ 
GKLPELQGVETELCYNVNWTAEALPSAEETKKL 
MWLFGCPLLLDDVARESWLLPGSNDLLLEVGPR 
LNFSTPTSTNIVSVCRATGLGPVDRVETTRRYRLS 
FAHPPSAEVEAIALATLHDRMTEQHFPHPIQSFSP 
ESMPEPLNGPINELGEGRLALEKANQELGLALDS 
WDLDFYTKRFQELQKNPSTVEAFDLAQSNSEHS 
RHWFFKGQLHVDGQICLVHSLFESIMSTQESSNP 
NNVLKFCDNSSAIQGKEVRFLRPEDPTRPSRFQQ 
QQGLRHWFTAETHNFPTGVCPFSGATTGTGGRI 
RDVQCTGRGAHWAGTAGYCFGNLHIPGYNLP 
WEDLSFQYPGNFARPLEVACEASNGASDYGNKF 
GEPVLAGFARSLGLQLPDGQRREWIKPIMFSGGI 
GSMEADHISKEAPEPGMEWKVGGPVYRJGVGG 
GAASSVQVQGDNTSDLDFGAVQRGDPEMEQKM 
NRVIRACVEAPKGNPICSLHDQGAGGNGNVLKE 
LSDPAGAIIYTSRFQLGDPTLNALEIWGAEYQESN 
ALLLRSPNRDFLTHVSARERCPACFVGTITGDRRI 
VLVDDRECPVRRNGQGDAPPTPPPTPVDLELEW 
VLGKMPRKEFFLQRKPPMLQPLALPPGLSVHQA 
LERVLRLPAVASKRYLTNKVDRSVGGLVAQQQC 
VGPLQTPLADVAWALSHEELIGAATALGEQPV 
KSLLDPKVAARLAVAEALTNLVFALVTDLRDVK 
CSGNWMWAAKLPGEGAALADACEAMVAVMA 
ALGVAVDGGKDSLSMAARVGTETVRAPGSLVIS 
AYAVCPDITATVTPDLKHPEGRGHLLYVALSPG 
QHRLGGTALAQCFSQLGEHPPDLDLPENLVRAFS 
ITQGLLKDRLLCSGHDVSDGGLVTCLLEMAFAG 
NCGLQVDVPVPRVDVLSVLFAEEPGLVLEVQEP 
DLAQVLKRYRDAGLHCLELGHTGEAGPHAMVR 
VSVNGAWLEEPVGELRALWEETSFQLDRLQAE 
PRCVAEEERGLRERMGPSYCLPPTFPKASVPREP 
GGPSPRVAILREEGSNGDREMADAFHLAGFEVW 
DVTMQDLCSGAIGLDTFRGVAFVGGFSYADVLG 
SAKGWAAAVTFHPRAGAELRRFRKRPDTFSLGV 
CNGCQLLALLGWVGGDPNEDAAEMGPDSQPAR 
PGLLLRHNLSGRYESRWASVRVGPGPALMLRG 
MEGAVLPVWSAHGEGYVAFSSPELQAQEEARGL 
APLHWADDDGNPTEQYPLNPNGSPGGVAGICSC 
DGRHLAVMPHPERAVRPWQWAWRPPPFDTLTT 
SPWLQLFINARNWTLEGSC 


3293 


A 


65 


642 


GVRGFWAGTMASRAGPRAAGTDGSDFQHRERV 

AMHYQMSVTLKYEIKKLIYVHLVIWLLLVAKMS 

VGHLRLLSHDQVAMPYQWEYPYLLSELPSLLGLL 

SFPRNNISYLVLSMISMGLFSIAPLIYGSMEMFPA 

AQQLYRHGKAYRFLFGFSAVSIMYLVLVLAVQV 

HAWQLYYSKKLLDSWFTST.QEKKHK 


3294 


A 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRA 

WSAGGPALGLMAAPVRLGRKRPLPACPNPLFVR 

WLTEWRDEATRSRHRTRFVFQKALRSLRRYPLP 

LRSGKEAKILQHFGDGLCRMLDERLQRHRTSGG 

DHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQP 

KAGGSGSYWPARHSGARVILLVLYREHLNPNGH 

HFLTKEELLQRCAQKSPRVAPGSARPWPALRSLL 

HRNLVLRTHQPARYSLTTEGLELAQKLAESEGLS 

LLNVGIGPKEPPGEETAVPGAASAELASEAGVQQ 
QPLELRPGEYRVLLCVDIGETRGGGHRPELLREL 

QRLHVTHTVRKLHVGDFVWVAQETNPRDPANP 

GELVLDHIVERKRLDDLCSSIIDGRFREQKFRLKR 

CGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQ 

VlDGFr VKK 1 ADLKJ&SAA Y LALL 1 KUEvKJLY QQH 

TLRSRPWGTPGNPESGAMTSPNPLCSLLTFSDFN 

AGAIKNKAQSVREVFARQLMQVRGVSGEKAAA 

LVDRYSTPASLLAAYDACATPKEQETLLSTIKCG 

RLQRNLGPALSRTLSQLYCSYGPLT 


3295 


A 


2 


1115 


EFHPHTQVSGLLTPQLQEPDVWSPSRGQPVSLHL 

PGKGAPEVKEMAWWKSWIEQEGVTVKSSSHFN 

PDPDAETLYKAMKGIGTNEQAIIDVLTKRSNTQR 

QQIAKSFKAQFGKDLTETLKSELSGKFERLIVAL 

MYPPYRYEAKELHDAMKGLGTKEGVnEILASRT 

KNQLREIMKAYEEDYGSSLEEDIQADTSGYLERI 

T "\ 7/~iT T AO CT)ln*T\T T C PPT 7T"\Ti AT AT f~\T\ A f~\T~\T "V 7" A A /~» T? 

LVCLLQGSRDDVSSFVDPALALQDAQDLYAAGE 

KIRGTDEMKJ^TILCTRSATHLLRWEEYEKIANK 

SIEDSIKSETHGSLEEAMLTVVKCTQNLHSYFAE 

RLYYAMKGAGTRDGTLIRNIVSRSEIDLNLIKCH 

FKKMYGKTLSSMIMEDTSGDYKNALLSLVGSDP 


3296 


A 


1 


838 


GTRGGVGPGDNGGVEAGAKPGAAAEPLRGDGS 
GETGPGRVAPGEVRGSPRGHVAGPEGPREVLFFF 
FLPSSKPASEVINEYSWKVDFLKGMLQAEKLTSS 
SEKALANQFLAPGRVPTTARERVPATKTVHLQS 

Tl A TJ"\7nPCT?"K ATI PT1 T /**» 'IT \ O A T -, T»~C < > jTTTW 7T» VD *TV ,, "1 7 A O 

RARYTSEMRSELLGTDSAEPEMDVRKKTGVAGS 
QPVSEKQSAAELDLVLQRHQNLQEKLAEEMLGL 
ARSLKTNTLAAQSVIKKDNQTLSHSLKMADQNL 
EKLKTESERLEQHTQKSVNWLLWAMLIIVCFIFIS 
MILFIRIMPKLK 


3297 


A 


46 


617 


HKQPAGFLGLWLGTETYTISFPGPETFGLGLSHA 

TuIPG SPACKQP V VGLH bLHN YRMAM V 5>AMSW 

VLYLWISAGAMLLCHGSLQHTFQQHHLHRPEGG 

TCEV1AAHRCCNKNRJEERSQTVKCSCLPGKVAG 

TTRNRPSCVDASIVIGKWWCEMEPCLEGEECKTL 

PDNSGWMCATGNKIKTTRIHPRT. 


3298 


A 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLAD 

PLNKSbYKYEADTVDLN Wt VISDMEVlbLlNKC I 

SGQSFEVILKPPSFDGVPEFNASLPRRRDPSLEEIQ 

KICLEAAEERRKYQEAELLKHLAEKREHEREVIQ 

KAIEENNNFIKMAI<^KLAQKMESN^ 

AMLERLQEKDKHAEEVRKNKELKEEASR 


3299 


A 


5 


892 


TQLPAPLSGVLSRLQLGSGAPLLTWVQETAGVA 

GGAPRRRTPVTMWRLLARASAPLLRVPLSDSWA 

LLPASAGVKTLLPVPSFEDVSPEKPKLRFIERAPL 

VPKVRREPKNLSDIRGPSTEATEFTEGNFAILALG 

yjyj Y LH W u ixr Jb MMJvL J 1IN Kb MJL/r jtuVMrAI WKVr 

APFKPITRKSVGHRMGGGKGAIDHYVTPVKAGR 

LWEMGGRCEFEEVQGFLDQVAHKLPFAAKAVS 

RGTLEKMRKDQEERERNNQNPWTFERIATANML 

GXRKVLSPYDLTHKGKYWGKFYMPKRV 




3300 


A 






FVAnnPRn^nQA aftmpptrvtpt nAfinnvnRQ 
CILVSIAGKNVMLDCGMHMGFNDDRRFPDFSYI 
TQNGRLTDFLDCVnSHFHLDHCGALPYFSEMVG 
YDGPIYMTHPTQAICPILLEDYTIJ^VDKKGEAN 
FFTSQMIKDCMKKVVAVHLHQTVQVDDELEIKA 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid- residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=AIanme OCystcinc, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, OGIycine, H^Histidine, 
I-Isoleucine, K=Lysine, L=Leucine, M=Mcthionine, 
N=Asparagine, P^Proline, Q=Glutamfne, R=Arginine, S=Serine, 
T<=Threonine, V-Valine, W«Tryptophan, Y=Tyrosine, 
X=Unknown, *«Stop codon, /^possible nucleotide deletion, 
\-possible nucleotide insertion 










YYAGrT^GAAMFQIKVGSESVVYTGDYNMTPD 

RHLGAAWIDKCRPNLLITESTYATTIRDSKRCRE 

RDFLKKVHETVERGGKVLIPVFALGRAQELCILL 

ETFWERMNLKWIYFSTGLTEKANHYYKLFIPWT 

NQKlRKTr^QimWEFKHIKAFDRAFADNPGPM 

VWATPGMLHAGQSLQERRKWAGNEKNMVIMP 

GYCVQGTVGHK1LSGQRKLEMEGRQVLEVKMQ 

VEYMSFSAHADAKGIMQLVGQAEPESVLLVHGE 

AKKMEFLKQKIEQELRVNCYMPANGETVTLPTS 

PSEPVGISLGLLKREMAQGLLPE AKKPRLLHGTLI 

MKDSNFRLVSSEQALKELGLAEHQLRFTCRVHL 

HDTRKEQETALRVYSHLKS VLKDHC VQHLPDG S 

VTVESVLLQAAAPSEDPGTKVLLVSWTYQDEEL 

GSFLTSLLKKGLPQAPS 


3301 


A 


2 


349 


CIRTEPAAAFRRLGALSGAAALGFASYGAHGAQ 
FPDAYGKELFDKANKHHFLHSLALLGVPHCRKP 
LWAGLLLASGTTLFCTSFYYQALSGDPSIQTLAP 
AGGTLLLLGWLALAL 


3302 


A 


59 


1184 


LRRNCSALGGLFQTnSDMKGSYPVWEDFINKAG 

KLQSQLRTTVVAAAAFLDAFQKVADMATNTRG 

GTREIGSALTRMCMRHRSIEAKLRQFSSALIDCLI 

NPLQEQMEEWKKVANQLDKDHAKEYKKARQEI 

KKK5SDTLKLQKKAKKGRGDIQPQLDSALQDVN 

DKYLLLEETEKQAVRKALIEERGRFCTFISMLRP 

VIEEEISMLGEITHLQTTSEDLKSLTMDPHKLPSSS 

EQVILDLKGSDYSWSYQTPPSSPSTTMSRKSSVC 

SSLNSVNSSDSRSSGSHSHSPSSHYRYRSSNLAQQ 

APVRLSSVSSHDSGFISQDAFQSKSPSPMPPEAPN 

QRRKEKREPDPNGGGPTTASGPPAAAEEAQRPRS 

M 


3303 


A 


511 


958 


AGRGGPGKPVSWSSGPGSPGQTQRRSWVKSTRG 
HSSLLPPSQDFVAGLSVILRGTVDDRLNWAFNLY 
DLNKDGCITKEEMLDIMKSIYDMMGKYTYPALR 
EEAPREHVESFFQKMDRNKDGWTIEEFIESCQK 
DENIMRSMQLFDNVI 


3304 


A 


40 


432 


ISEAASGAFQAR*FYQM\LEQKTDALGKQSVNRG 
FIXDKTLSSIFNIEMVKEKTAEEIKQIWQQYFAA 
KDTVYAVIPAEKFDLIWNRAQSCPTFLCALPRRE 
GYEFFVGQWTGTELHFHCTYKYSDPEGKA 


3305 


A 


2 


483 


LDACSTGPYSRSTHASADAWADAWVVVVLKVV 
GMTLFLLYFPQIFN1CSNDGITTTRSYGTVSQIFGS 
RSPSPNGFITTRSYGTVCPKDWEFYQARCFFLIHL 
*\SSWNESWDFCKGKGCTLAIVDNSETLKLLHDL 
HDAEKNYIALPYRSSKYMSTCNGTF 


3306 


A 


2 


872 


TLSSACLIGDAWKELTIVAGAVSNQLLVWYPAT 

ALADNKPVAPDRRJSGHVGIIFSMSYLESKGLLA 

TASEDRSVRIWKGGDLRVPGGRVQNIGHCFGHS 

ARVWQVKLLENYLISAGEDCVCLVWSHEGEILQ 

AFRGHQGRGIRAIAAHERQAWVITGGDDSGIRL 

WHLVGRG YRGLG/DLGSLLQVP* *ARYTQGCDS 

GWLLA I AuSD*YRGPVSL*RRGQVLGAAARG*T 

FPVLLPAGGSSWSRGLRTVCYGQWGRSCQGCPH 

QHSNCCCGPDPVSWEGAQLELGPAWL 


3307 


A 


2 


927 


RTSRVEKGLRKAGAAVTMESDEWFSQALPANTS 
AQKAELLALTQAIRWGKDINVNTDSRYAFATVH 

WGAICQERRLLTSAEKAIKNKNPPSSO^ 

WGTTCDQVNAKQGPKPSPGHRLRRNLPGEKWEI 

DFTKVKPHQAGYKYLLVLVDTFSGWTEAFATK 

NETVNMVVKFLLNEIIPRHGLPVAIGSDNGPAFA 

LSIV*SVSKALOTQWKLHCAYRPQSSGQVERMNC 

TLKNTLTKLILETGVNWVSLLPLALLRVRCTPYW

AGH.PFEIMYGRVLPILPKLRDAQLAKISQTNLLQ 

YLQSP 


3308 


A 


490 


1077 


NSPSLDFNDNEDIPTELSDSSDTTtDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTNTSPFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTAWRETQTMYKASQESECYVEDAEVLTH 

DVPYHDYFYTI1SFRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3309 


A 


490 


1077 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDICLYDLLFTNSPFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTl'TLTNP 

LEHKTATVRETQTMYKASQESECYVEDAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3310


A 


2 


1198


SPLCHPGLSRER/S*SEAKLRSGRYC*KRQVEAPL 

*RPGL*TN4AASDTERDGLAPEKTSPDRDKKKEQS 

EVSVSPRASKHHYSRSRSRSRERKRKSDNEGRKH 

RSRSRSKEGRRHES10)KSSKXHKSEEHNDKEHSS 

DKGRERLNSSENGEDRHKRKERKSSRGRSHSRS 

RSRERRHRSRSRERKKSRSRSRERKKSRSRSRER 

KKSRSRSRERKRRIRSRSRSRSRHRHRTRSRSRTR 

SRSRDRKXRIEKPRRFSRSLSRTPSPPPFRGRNTA 

MDAQEALARRLERAKKLQEQREKEMVEKQKQQ 

EIAAAAAATGGSVLNVAALLASGTQVTPQIAMA 

AQMAALQAKALAETG1AVPSYYNPAAVNPMKF 

AEQEKJKRKMLWQGKICEGDKSQSAGNMGKN 


3311 


A 


177 


4 


PIQIPPRITPPRPSPHLLTPRTGSSPPPPRAPSPPHPT 
PGPAHDFPPLSAVLSGHTKT 




3312

A 


3 


426 


LESPRH*PPCWGPLIWALTVSSVPSPTPELSCILKS 
P/RPACPV/PGLWPSLLSPAPPQSSGPLLGLSPCPG 
AGQWPSPLSPAPPPSSDPLSGLSPCPGAGPRSSPVS 
ASAPCRAVPLSPRRLTWPPHLQVGILIPTGRPWK 

XTT 

NL 


3313


A 


162 


2 


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
IAP\CTPAWVTQRDFFRKKK 


3314 


A 


162 


2 


QLQNLASRGCL* SQLLRRLRRENRLNPGGGGCSE 
IAP\CTPAWVTQRDFFRKKK 


3315


A 


460


1 


PRKRESWWGERLP/PRGFPPAAEDAPAPGWKGR 
KHASRTARAHVFHPIRQSrRSPVRGRPGDPRAAH 
TRSAGTRLQCKASRGG*GKGPAPTR*EGGPGSAP 
APLPASSGCSLFPDSSPWTPPPPAPGAAAAQP* *T 
PRCPAALRAGAfflGRVGRPY 


3316 


A 


3 


2307 


NHLGTLMQNWDSSSRVPFSSGQHSTQSFPPSLMS 
KSNSMLQKPT\AYVRPMDGQESMEPKLSSEHYSS

OQMnisJ^'M'TPT TTPQQlir A ITT tvt ftdohdt v\ac a or^ 
v^orivjJNoivi i JCijjsXoois.ArlL 1 JhJjisJUro^rJLJJAoAovj 

DVSCVDEELKEMTHSWPPPLTAIHTPCKTEPSKFP 

FPTKESQQSNFGTGEQKRYNPSKTSNGHQSKSM 

LKDDLKLSSSEDSDGEQDCDKTMPRSTPGSNSEP 

SHHNSEGADNSRDDSSSHSGSESSSGSDSESESSS 

SDSEANEPSQSASPEPEPPPTNKWQLDNWLNKV 

NPHKVSPASSVDSNPSSQGYKKEGREQGTGNSY 

TDTSGPKETSSATPGRVAPKPIQKGSESGRGRQKS 

PAQSDSTTQRRTVGKKQPKKAEKAAAEEPRGGL 

KIESETPVDLASSMPSSRHKAATKGSRKPNIKKES 

KSSPRPTAEKKKYKSTSKSSQKSREIIETDTSSSDS 

DESESLPPSSQTPKYPESNRTPVKPSSVEEEDSFFR 

QRMFSPMEEKELLSPLSEPDDRYPLIVKIDLNLLT 

RJPGKP YKETEPPKGEKXNVPEKHTREAQKQASE 

KVSNKGKRKHKNEDDNRASESKKPKTEDKNSA 

GHKPSSNRESSKQSAAKEKDLLPSPAGPVPSKDP 

KTEHGSRKRTISQSSSLKSSSNSNKETSGSSKNSS 

STSKQKK-nEGKTSSSSKEVKVKAPSSSSNCPPSAP 

TLDSSKPRRTKLVFDDRNYSADHYLQEAKKLKH 

NADALSDRFEKAVYYLDAVVSFIECGNALEKNA 

QESKSPFPMYSETVDLI 


3317 


A 


496 


2 


NLLQDEKLVHSYPYDWRTQETCGYIVPARQWFI 
N\TRDIKTAAKELLKKVKFIPG S ALNGMVEMMD 
RRPYWCISRQRVWGVPIPVFHHKTKDEYLINSQT 
TEHIVKLVEQHGSDrWWTLPPEQLLPKEVLSEVG 
GPDALEYVPGQDILDIWFDSGTSWSYVLPGPD 


3318 


A


2 


512


AWHEGDSRSDQCHHPYNYGFDYYYGMPFTLVD 

SCWPDPSRNTELAFESQLWLCVQLVAL^ILTLTF 

GKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSP 

LYWDCLLMRGHEITEQPMKAEXRAGSIMVKEAIF 

LFRKGHSKGKLFLLFFLPFLQVHKTFPTTDGFHW 

AP 


3319 


A


407 


1 


SSLHRSPRPASPLPVPEAP\SFLPVPAPKPSALPPFS 
LSGAPSSASTFSPHSSPSPASPTPAPSPQSPFPSRPT 
SPPSLTPTRRPPLPADRRGPHLLYQPLHAPLEAAA 
TGPE/PSAAAGRLPRPRPPWRAAYPASR 


3320 


A 


4037 


3432 


QMSEAVAEKMLQYRRDTAGWKICREGNGVSVS 

WRPSVEFPGNLYRGEGIVYGTLEEVWDCVKPAV 

GGLRVKWDENVTGFEIIQSITDTLCVSRTSTPSAA 

MKLISPRDFVDLVLVKRYEDGTISSNATHVEHPL 

CPPKPGFVRGFNHPCGCFCEPLPGEPTKTNLVTFF

HTDLSGYLPQNVVDSFFPRSMTRFYANLQKAVK 


3321 


A


37 


360 


SHSASGAGRPAAPAADLRPAPNGQRPGPRLGAR 
ALWLPPRGRPDEAGRLPGEHLPQVPWDPGLTRS 
PSPRGPCRGAARAGHVGETPAPWGCPPPCAWEH 
KGPGSEGTP 


3322 


A 


1 


420 


AIVEDKHSGRSYDITSDLGNVLTSTS1AKTVNG*A 

ESSDSGAESDEEDAQEDLMGAYHSDIDKKMMKI 

VADHKNLEVIVTNGYDKDGFVHDIQNDIHASSSL 

NGRSTVHVKPIDENLGQTGKSAVCIHQDINDDH 

VEDVT 


3323 


A 


8 


459


DTLSLNCTLPETLPMTPSF*LSFL*FPGLARAKSIP 

TKTYSNEVVTLWYRPPDELLGSTDYSTQIDMW*G 

QVEVWQGPCGKGGGLVTTATQPAAFLFTVPSLP 

RGVGCIFYEMATGRPLFPGSTVEEQLHFIFRILSE 

EAWALCAVETHR 


3324 


A 


1276 


466 


PGSTHASARITIY*L*IILSNATEVDNNFSKPPPFFP 
AGAPPASSSSSSSSSSPPTVSTAPPLIPPPGFPPPPG 
APPPSLIPTIESGHSSGYDSRSARAFPYGNVAFPH 
LPGSAPSWPSLVDTSKQWDYYARSSSSSSSSSSSS 
SSSPRDRDRER*RTRERERERDHSPTPSVFNSDEE 
RYRYREYAERGYERHRASREKEERHRERRHREK 
EETRHKSSRSNSRRRHESEEGDSHRRHKHKKSKR 
SKEGKEAGSEPAPEQESTEATPAE 


3325 


A 


266 


3312 


TCLFSASCSSLPSPSSSFALLSTENTQRTYRVNPD
GSLRVTFASGMEIGLSSEPHILAGAVNPTLGKCNI 
SLPGEHNANLISVL**GEQGCA*NVFHISFS*AHN 
RNLLSIDFDHITRTGKIYDDHRKFTLR1LYDQTGR 
PILWSPVSRYNEVNITYSPSGLVTFIQRGTWNEK 
MEYDQSFL*SPQL*LSIICYSAFVSFQSVMLLLHS 
QRRYIFEYDQPDCLLSVTMPSMVRHSLQTMLSV 
GYYRNIYTPPDSSTSFIQDYSRDGRLLQTLHLGTG 
RRVLYKYTKQARLSEVLYDTTQVTLTYEESSGD 
LSDSSTLIA*LLTVFVLVPAGPLIGRQIFRFSEEGL 
WARFDYSYNNFRVTSMQAVINETPLP1DLYRYV 
DVSGRTEQFGKFSVINYDLNQVITTTVMKHTKIF 
SANGQVIEVQYEILKAIAYWMTIQYDNVGRMVI
CDIRVGVDANITRYFYEYDADGQLQTVSVNDKT 
QWRYSYDLNGNINLLSHGKSARLTPLRYDLRDRI 
TRLGEIQ YKMDEDGFLRQRGNDIFEYNSNGLLQ 
KAYNKASGWTVQYYYDGLGRRVASKSSLGQHL 
QFFYADLTNPIRVTHLYNHTSSEITSLYYDLQGH 
LIAMELSSGEEYYVACDNTGTPLAVFSSRGQVIK 
ED.YTPYGDIYHDTYPDFQVIIGFHGGLYDFLTKL 
VHLGQRDYDVVAGRWTTPNHHIWKQLNLLPKP 
FNLSTKLIKYGIFHFLFLILCLTDIRSWLELFGFQL 
HNVLPGFPKPELENSPSI*QMSNSMLHLLCASLS* 
TILGIQCELQKQLRNFISLDQLPMTPRYNDGRCLE 
GGKQPRFAAVPSVFGKGIKFAIKDGIVTADIIGVA 
NEDSRRLAAILNNAHYLENLHFTffiGRDTHYFIK 
LGSLEEDLVLIGNTGGRRILENGVNVTVSQMTS V 
LNGRTRRFADIQLQHGALCFNIRYGTTVEEEKNH 
VLEIARQRAVAQAWTKEQRRLQEGEEGIRAWTE 
GEKQQLLSTGRVQGYDGYFVLSVEQ 


3326 


A 


290 


1041 



kaclhllssfltsnflfnpllpdslysvearsqra 

nlgpcrrkrlqtlmrlaagfqysshkdpslsak 

ekhtdyhneargpwpgwvg*r;tadgscgrgpd 

gahhpgpkssswrasrllpglggshhldayvgr 

dlecgtpaplqleippqprghpapiptgqagprds 

gpgasp*vetrpltdgrr*pgvrpvgwtpahpag 

tlrprgavepsvsacgkwapsptsqgccegrcd 

avpkhrawrtplcsq 


3327 


A 


1 


418 


CSECGKSFCKKSKFIIHQRTHTGEKPYECNQCGK 

SFCQKGTLTVHQRTHTGEKPYECNECGKNFYQK 

LHLIQHQRTHSGEKPYECSYCGKSFCQKTHLTQH 

QRTHSGERPYVCHDCGKTFSQKSALNDHQKIHT 

GVKLY 


3328 


A 


1 


270 


VTRKLPIFrVDAFTARAFRGSPAADCLLENELDED 
MHQKIAREMNLSETAFIRKLHPTDNFAQRSCFGL 
IWFTPTTDLQILTSSILPSIL 




3329


A


A C 

45 


A 1 fV 

4iy 


EELSCWQIWQQIANDLTRCQDSMrNNSQCHKCJG 
DFPYQVGTELSIQISEDENYIVNKADGPNNTGNP 
EFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLC 
QCKKGVDPIGWISHHDGHRVHKR 


3330 


A 


64 


430 


FWRI^JFTGLAPAAAVATTTSSS'TNIRFTSISNSLTST 
AAIGLSFTTSTTTTATFTTNTTTTITSGFTVNQNQ 
LLSRGFENLVPYTSTVSVVTTPVMTYGHLEGLIN 
EGNLELEIKRRLSSQATQ 


3331 


A 


3 


407 


TFGCSCTDCFFQKCCPAEAGVLLAYNKNQQIK1P 
PGTPIYECNSRCQCGPDCPNRIVQKGTQYSLCIFR 
TSNGRGWGVKTLVKIKRMSFVMEYVGEVITSEE 
AERRG QFYDNKG ITYLFDLD YE SDEFTVD AA R Y 


3332 


A 


25 


461 


PAADFVLQARPTRADILGIHSKYDEVRKAGACFY 

KMTGLGPGPQALYNGEPFKHEEMNIKELKMAVL 

QRMMDASVYLQREVFLGTLNDRTNAIDFLMDR 

NNVVPRINTLILRTNQQYLNLLSTSVTADAEDFS 

TFFFLDSQDKSA 


3333 


A 


317 


54 


AWIIFLPPLTSCPLWAPGTKHKTELEARSGLGPIK 
AYPRLGPPTPGEPEAPAQDRTFHCEICNVKVNSK 
VQLKQHISSRRHEIVDPV 


3334 


A 


304


410 


AGPSLPSNLRQIFQSLPPFMDELLLLLFFMIIFAI 


3335 


A 


19 


418 


VESRNSRVQPRVRLNDRTNAIDFLMDRNNVVPRI 
NTLILRTNQQYLNLISTSVTADVEDFSTFFFLDSQ 
DKSAVIAKNMYYLTQDDESnSAATLWIIADFDK 
PSGRKLLFNALKHMITSVHSRVGI1YNPFF 


3336 


A 


1 


1003 


PSS YS SDEL SPGEPLTSPP WAPLGAPERPEHLLNR 

A^ERLAGGATRDSAASDILLDDIVLTHSLFLPTEK 

FLQELHQYFVRAGGMEGPEGLGRKQACLAMLL 

HFLDTYQGLLQEEEGAGHUKDLYLLIMKDESLY 

QGLREDTLRLHQLVETVELKIPEENQPPSKQVKP 

LFRHFRRIDSCLQTRVAFRGSDEIFCRVYMPDHS 

YVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS 

LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFAC 

TRDS YEALVPLPEEIQVSPGDTEIHRVEPED V ANH 

LTAFHWELFRCVHELEFVDYVFHGE 


3337 


A 


444 


43 


KILLCLANQFPDISFCPALPAVVALLLHYSIDEAE 
CFEKACRILACNDPGRRLIDQSFLAFESSCMTFGD 
LVNKYCQAAHKLNTVAVSEDVLQVYADWQRWL 
FGELPLCYFARVFDVFLVEGYKVLYRVALAXXF 


3338


A 


1 


398 


FRGKVRGRSAEMPGSDTALTVDRTYSDPGRHHR 
CKSRVERHDMNTLSLPLNIRRGGSDTNLNFDVPD 
GILDFHKVKLTADSLKQKILKVTEQIKIEQTSRDG 
NVAEYLKLVNNADKQQAGRIKQVFEKKNQK 


3339 


A 


1 


665 


AAAASNWGLITNIVNSIVGVSVLTMPFCFKQCG1

VLGALLLVFCSWMTHQSCMFLVKSASLSKRRTY 

AGLAPHAYGKAGKMLVETSMIGLMLGTCIAFYV 

VIGDLGSOTFARLFGFQVGGTFRMFLLFAVSLCI 

VLPLSLQRNMMASIQSFSAMALLFYTVFMFVIVL 

SSLKHGLFSGQWLRRVSYVRWEGVFRCIPIFGMS 

FACQSQVLPTYDSLDEPSV 


3340 


A 


198 


367 


LLPLQVLQEAFSRC VA VLTRS SKPSDMSVQ VCG 
YISKCYSVAAQFEECREKITEMP 


3341 


A 


562 


277 


HSVIKRTPRKYLAEIVLIDDFSNKEHLKEKLDEYI 
KLWNGLVKVFRNERREGLIQARSIGAQKAKLGQ 
VLIYLDAHCEVA VNWYAPLVAPISKDR 


3342 


A 


385 


2 


NLTWWPLFRDVSFYIVDLIMLIIFFLDNVIMWWE 
SLLLLTAYFC YVWMKFNVQVEK WVKQMTNRN 
KWKVTAPEAQAKPSAARDKDEPTLPAKPRLQR 
GGSSASLHNSLMKNSWQNKmTLD?HV 


3343 


A 


1 


385 


FRVDNSEEWKDVFnSSERSFKLDSLKCGTWYKV 
isJL/AAJsJNb V uouKlbJbJibAK 1 HOlvbPSFSKDQHLF 
THINSTHARLNLQGWNNGGCPITAIVLEYRPKGT 

WAwnnT p AXTCcn*c\rcT tdt T>"C A*T\irv 
w/v w^UJLKArNboUcvrJLlJbLKJbA 1 W Y 


3344 


A 


351 


147 


SPACITSSLSQHIADPPJ^APTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3345 


A 


351 


147 


SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3346 


A 


3 


1509 


AG1RHEAPPTTSNRHRRQIDRGVTHLN1SGLKMP 

RGIAIDWVAGNVYWTDSGRDVEEVAQMKGENR 

KTLISGMIDEPHAIVVDPLRGTMyWSDWGNHPK f 

IETAAMDGTLRETLVQDNIQWPTGLAVDYHNER 

LYWADAKLSVIGSIRLNGTDPIVAADSKRGLSHP 

FSIDVFEDYIYGVTYI>WRVFKIHKFGHSPLVNLT 

GGLSHASDVVLYHQHKQPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPPPD 

APRPGTCNLQCFNGGSCFLNARRQPKCRCQPRY 

TGDKCELDQCWEHCRNGGTCAASPSGMPTCRCP 

TGFTGPKCTQQVCAGYCANNSTCTVNQGNQPQ 

CRCLPGFLGDRCQYRQCSGYCENFGTCQMAAD 

GSRQCRCTAYFEGSRCEVNKCSRCLEGACVVNK 

QSGDVTCNCTDGRVAPSCLTCVGHCSNGGSCTM 

N SKMMPECQCPPHMTGPRCEEHVFSQQQPGHIA 

SILIP 


3347 


A 


974 


666 


SPEMESHPITQAGVQWHHLSSLQPLPPGFK*FSCF 
SLPE*LGYRHVPPCLANSVFSVEMG\FLHVGQAG 
LELLTSGDLPALASQSAGITGNSHRARPENGFENIF 


3348 


A 


1 


1171 


LSK1TMPVICNEPLSFIQRLTEYM*HTYFIHRPSSL 

SDPVDRMQCVAAFAVSAVASQWERTGKPFNPLL 

GETYELVRDDLGFRLISEQVSHHPPISAFHAEGLN 

NDFIFHGSIYPKLKFWGKSVEAEPKGTITLELLEH 

NEAYTWTNPTCCVHNIIVGKLWIEQYGNVEIINH 

KTGDKCVLNFKPCGLFGKELHKVEGYIQDKSKK 

KLCALYGK\\nTECLYSVDPATFDAYKKNDKKNT 

EEKKN SKQMSTSEELDEMPVPDSES VFIIPG SVLL 

WRIAPRPPNSAQMYNFTSFAMVLNEVDKDMESV 

IPKTDCRLRPDIRAMENGEIDQASEEKKRLEEKQ

KAAKi^KblvSEEDWKTRWFH 

WTYSGSYWDRNYFNLPDIY 


3349 


A 


403 


497 


NFASSSGKYLRTQKIKCLNNKJFTPFPTTEKK*SQS 
VRPP*SNRIY*ILQS*NISFS*LPN*NFASSSGKYLR 
TQKIKGLNNKFTPFPTTEKK 


3350 


A 


1 


712 


GAPAQDCICLPFPFHSSFLESDniKPARRKIQTTNP 

DFLLLLFMSVPVVSAPPFCPPAEGSRDGRPKASV 

ARPAA VHEHHSPRD CGHLPDVIRSSLGG WQPH* P 

AQPENRLL*LLPVE*GHQHPTVSPVP*AGSPGGAS 

GWPGPGQAWRVRVPGPHPLCPPASPPSPVQQ**E 

SVAAGSGLPGCVLCAAGRRPGPLPLLCVEVGQA 

LPPGAWVSSSGQRPGLTHPLAYSHGCVPSEG 




3351

A 


1




MAAVVAATALKGRGARNARVLRGILAGATANK 
ASHNRTRALQSHSSPEGKEEPEPLSPELEYIPRKR 

OKTsTPMKAVfrT AWATfrFPrniT T FTT TYPPVnifriP 

xjrviNi ivjLI\_r\ v vjl_//A. w " |tTr ~* ■ 1 - r 11 - i fx. fx n. V JL-J JSJL/X\. 

VKQMKARQNMRLSNTGEYESQRFRASSQSAPSP 
DVGSGVQT 


3352 


A 


2 


841 


RTLFRGRRRREDDRISRPHPSTAESKAPTPKFDLL 
ASNFPPLPGSSSRMPGELVLENRMSDVVKGVYK 

EKDNEELTISCPVPADEQTECTSAQQLNMSTSSP 

CAAELTALSTTQQEKDLEEDSSVQKDGLNQTTIP 

VSPPSTTKPSRASTASPCNNNINAATAVALQEPR 

KLSYAEVCQKPPKEPSS VLVQPLRELRSNVVSPT 

KNEDNGAPENSVEICPHEKPEARASKDYSGFRGN 

IIPRGAAGKIREQRRQFSHRAIPQGVTRRNGKEQ 

YVPPRSPK 


3353 


A


1054 


587 


IATPTWTAPLTATPTPAHQYGPARVPNGAPRLEP 
PPGKRECRVGQYVVDLTSFEQLALPVLRNADCS 
SGPGQRVCVIDEIGKMELFSQLFIQAVRQTLSTPG 
TIILGTffVPKGKPLALVEEIRI^JDVKVFNVTKE 
NRNHLLPDIVTC VQ S SRK 


3354 


A 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVLSNLAEWER 

VLTFLPAKALLRVACVCRLWRECVRRVLRTHRS 

VTWISAGLAEAGHLEGHCLVRWAEELENVRJDLP 

mVLYMADSETFISLEECRGHKRARKRTSMETA 

LALEKLFPKQCQVLGIVTPGIVVTPMGSGSNRPQ 

EIEIGESGFALLFPQIEGIKIQPFHFIKDPKMLTLER 

HQLTEVGLLDNPELRVVLVFGYNCCKVGASNYL 

QQWSTFSDMNIILAGGQVDNLSSLTSEKNPLDI 

DASGWGLSFSGHRIQSATVLLNEDVSDEKTAEA 

AMQRLKAANIPEHNTIGFMFACVGRGFQYYRAK

GNVEADAFRKFFPSVPLFGFFGNGEIGCDRIVTG 

NFILRKCNEVKDDDLFHSYTTIMALIHLGSSK 


3355


A 


1


707 


GTSSGLGGDRLAAPGPSPPSFYPQGRGERAYDIY 

SRLLRERJVCVMGPIDDSVASLVIAQLLFLQSESN 

KKPIHMYINSPGGVVTAGLAIYDTMQYILNPICT 

WCVGQAASMGSLLLAAGTPGMRHSLPNSR1MIH 

QPSGGARGQATDIAIQAEEIMKLKKQLYNIYAKH 

TKQSLQVIESAMERDRYMSPMEAQEFGILDKVL 

VHPPQDGEDEPTLVQKEPVEAAPAAEPVPAST 


3356


A


352 


338 


FNYNFCRNLHMPSFLV * PGMCGLL AKHLSFHTVG 
AFLIT/LGVAALCKFAVA*PRKKAYADFYRNTYN* 
IKEFEVRKAN1SQSTK 


3357


A 


1 


403 


algscggllgtgllkgtmsgtlwskg3fagykr 
ririqrehtavlkiegwyardetefylrmicanv 
ykankktvtpvltpdktrvmwrkvtqahgisi 
m\ouqfr™lpadaighrirmml*psrmytteps 


3358 


A 


71 


2897 


FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHA 

VMDSERQVKDTDDIESPKRSIRDSGYIDCWDSER 

SDSLSPPRHGRDDSFDSLDSFGSRSRQTPSPDWL 

RGSSDGRGSDSESDLPHRiaPDVKKDDMSARRT 

SHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKK 

AEREEYRKSWSTATSPAGLGBCKALQDYGPRTVPV 

S\DDAESTSMFDMRCEEEAAVQPHSRARQEQLQ 

LINNQLREEDDKWQDDLARWKSRKRSVSQDLIK 

KEEERKKMEKLLAGEDGTSERRKSIKTYREIVQE 

KERRERELHEAYKNARSQEEAEGILQQYIERFTIS 

EAVLERLEMPKILERSHSTEPNLSSFLNDPNPMK 

YLRQQSLPPPKFTATVETTIARASVLDTSMSAGS 

GSPSKTVTPICAVPMLTPKPYSQPKNSQDVLKTFK 

VDGKVSVNGETVHREEEKERECPTVAPAHSLTK 

SQMFEG VARVHG SPLELKQDNGSIEINIKKPNS V 

PQELAATTEKTEPNSQEDKNDGGKSRKGNIELAS 

SEPQHFTTTVTRCSPTVAFVEFPSSPQLKNDVSEE 

KDQKKPENEMSGKVELVLSQKWKPKSPEPEAT 

LTFPFLDKMPEANQLHLPNLNSQVDSPSSEKSPV 

TTPFKFWAWDPEEERRRQEKWQQEQERLLQER 

YQ\KEQDK\LKEE\WEKAQKEVEEEERRYYEEEP*

n\EDPWPFTVSSSSADQLSTSSSMTEGSGTMNKI 

DLGNCQDEKQDRRWKKSFQGDDSDLLLKTRES 

uivuc,iiiVUc>JL. l iiOAJLArlbvjNrVSKG VHEDHQLDT 

EAGAPHCGTNPQLAQDPSQNQQTSNPTHSSEDV 

KPKTLPLDKSINHQIESPSERRXS1SGKKLCSSCGL 

PLGKGAAMIIETLNLYFHIQCFRCGUCKGQLGDA 

v ou xu viviJKjNUL/JLiNUiNJJU YJylKoKoAGQPi 1 JL 


3359 


A 


3 


368 


EVTASREGRGACAWECGSSRGPWGLLRGTFAPV 
RAATP*S*LPKGSLRHRP*/CPPPVHLPPKSSCPPR
AWAGRATSM*TSSYSSEYQPQTP*ALVTLPPRSY 

VT T TUT T TT TUT TJTJrMT tt:t> 


3360 


A 


2 


392 


ARGIGSLGRDHSGSGGGTGMAGAWVRKAADYV 
RSKDFRDYLMSTHFWGPVANWGLP1AAITDMK\ 
KSPEIISRRMTFAL*CYSLTFVRFAHYVQ\PWNWL 
MLGCHTAVDFDQLISSMPCISHGMTASASAL 


3361 


A 


4619 


532 


LLLGRANSPPYNSVVRTLPPATLLLRRAGWESF 

WSCQSRSPWPPRPEVRAPAKGPRGVAGAAGACS 

AGARLGDAAGGDPASGQAARGCGARAPRGLGR 

TARARDTAMEDAGAAGPGPEPEPEPEPEPEPAPE 

PEPEPKPGAGTSEAFSRLWTDVMGILDGSLGNID 

DLAQQYADYYNTCFSDVCERMEELRKRRVSQD 

LEVEKPDASPTSLQLRSQIEESLGFCSAVSTPEVE

RKNPLHKSNSEDSSVGKGDWKKKNKYFWQNFR 

KNQKGIMRQTSKGEDVG YVASEITMSDEERIQL 

MMMVKEKMITIEEALARLKEYEAQHRQSAALDP 

ADWPDGSYPTFDGSSNCNSREQSDDETEESVKF 

KRLHKLVNSTRRVRKKLIRVEEMKKP\STEGGEE 

HVFENSPVLDERSALYSGVHICKPLFFDGSPEKPP 

EDDSDSLTTSPSSSSLDTWGAGRKLVKTFSKGES 

RGLIKPPKKMGTFFSYPEEEKAQKVSRSLTEGEM 

KKGLGSLSHGRTCSFGGFDLTNRSLHVGSNNSDP 

MGKEGDFVYKEVDCSPTASRISLGKKVKSVIOET 

MRKRMSKKYSSSVSEQDSGLDGMPGSPPPSQPD 

PEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVS 

TTDSSTSNRESVKSEDGDDEEPPYRGPFCGRARV 

HTDFTPSPYDTDSLKLKKGDIIDIJSICPPMGTWMG 

LLNNKVGTFNFIYVDVLSED\EEKPKRPTRRRRK 

GRPPQPKSVEDLLDRJNLKEHMPTFLFNGYEDLD 

TFKLLEEEDLDELNIRDPEHRADLLTAVELLQEY 

DSNSDQSGSQEKLLVDSQGLSGCSPRDS*CYESS 

ENLENGKTRKASLLSAKSSTEPSLKAFSRNQLGN 

YPTLPLMKSGDALKQGQEEGRLGGGLAP\DTSKS 

CDPPGC*LVLN\KNRRKPPSFPSCRSC\ETL\EGPQ 

TVDTWPRSHSLDDLQVEPGAEQDVPTEVTEPPPQ 

IVPEVPQKTTASSTKAQPLEQDSAVDNALLLTQS 

KRFSEPQKLTTICKLEGSIAASGRGLSPPQCLPRNY 

DAQPPGAKHGLARTPLEGHRKGHEFEGTHHPLG 

TKEGVDAEQRMQPKIPSQPPPVPAKKSRERLANG 

LHPVPMGPSGALPSPDAPCLPVKRGSPASPTSPSD 

CPPALAPRPLSGQALGSPPSTRPPPWLSELPENTS 

LQEHGVKLGPALTR\KVSCARGVDLETLTENKL\ 

HAEGIRSSRREPYS*LRHGRCGI\P\EALVQRYAED 
LDQPERDVAANMDQIRVKQLRKQHRMAIPSGGL 
TEICRKPVSPGCIS\SVSDWLIS1GLPMYAGTLSTA 
GFSTL\SQVPSLSHTCLQEAG\ITEERHIRK\LLSAA 
RLFKLPPGPEAM 


3362 


A 


1 


4653 


FRGGVGYAHTLHLLPFAGSSVVLARARRTDRWT 
SGLVEMATLSLTVNSGDPPLGALLAVEHVKDDV 
SISVEEGKENILHVSENVIFTDVNSILRYLARVAT 
TAGLYGS^MEHTEEDHWLEFSATKLSSCDSFTS 
TINELNHCLSLRTYLVGNSLSLADLCVWATLKG 
NAAWQEQLKQKKAPVHVKRWFGFLEAQQAFQS 
VGTKWDVSTTKARVAPEIGCQDVGKFVELPGAE 
MGKVTVRFPPEASG YLHIGHAKAALLNQHYQV 
NFKGKL1MRFDDTNPEKEKEDFEKVILEDVAML 
H1KPDQFTYTSDHFETIMKYAEKLIQEGKAYVDD 
TPGEQIKAEREQRIESKHRKNPIEKNLQMWEEMK 
KGSQFGHSCCLRAKIDMSSNNGCMRDPTLYRCK 
IQPHPRTGN*Y\NV\YPTYDFACPIVDS1EGVTHAL
RTTEYHDRDEQFYWIIEALGIRKPYIWEYSRLNL 
NNWLSKRKLTWFVNEGLVDGWDDPRFPTVRG 
VLRRGMTVEGLKQFIAAQGSSRSVVNMEWDKI 
WAFNKKVIDPVAPRYVALLKKEVIPVNVPEAQE 
EMKEVAKHPKNPEVGLKPVWYSPKVFIEGADAE 
TFSEGEMVTFINWG>n^MTKIHKNADGKIISLDAK 
LNLENKDYKKTTKVTWLAETTHALPIPVICVTYE 
HLITKPVLGKDEDFKQYVNKNSKHEELMLGDPC 
LKDLKKGD1IQLQRRGFFICDQPYEPVSPYSCKEA 
PCVLIYIPDGHTKEMPTSGSKEKTKVEA-IKNETS 
APFKERPTPSLNNNCTTSEDSLVLYNRVAVQGD 
VVRELKAKKAPKEDVDAAVKQLLSLKAEYKEK 
TGQEYKPGNPPAEIGQNISSNSSASILESKSLYDE 
VAAQGEVVRKLKAEKSPKj^JONEAVECLLSLICA 
QYKEKTGKEYIPGQPPLSQSSDSSPTRNSEPAGLE 
TPEAKVLFDKVASQGEVVRKLKTEKAPKDQVDI 
AVQELLQLKAQYKSLIGVEYKPVSATGAEDKDK 
KKKEKENKSEKQNKPQKQNDGQRKDPSKNQGG 
GLSSSGAGEGQGPKKQTRLGLEAKKVEENLADW 
YSQVITKSEMIEYHDISGCYILRPWAYAIWEAIKD 
FFDAEIKKLGVENCYFPMFVSQSALEKEKTHVA 
DFAPEVAWVTRSGKTELAEPIAIRPTSETVMYPA 
YAKWVQSHRDLPIKLNQWCNVVRWEFKHPQPF 
LRTREFLWQEGHSAFATMEEAAEEVLQILDLYA 
QVYEELLAIPWKGRKTEKEKPAGGDYTTTIEAF 
ISASGRAIQGGTSrfflLGQNFSKMFErVFEDPKJPG 
EKQFAYQNSWGLTTRTiGVMIMVHGDNMGLVL 
PPRVACVQW1IPCGITNALSEEDKEALIAKCNDY 
RRRLLSVNIRVRADLRDNYSPGWKFNHWELKG 
VPIRLEVGPRDMKSCQFVAVRRDTGEKLTVAEN
EAETT<XQAILED1QVTLFTRASEDLKTHMVVANT 
MEDFQKILDSGKIVQIPFCGEnDCEDWIKKTTARD 
QDLEPGAPSMGAKSLCPFKPLCELQPGAKCVCG 
KNPAKYYTLFGRSY


3363 


A


3797 


1514 


LGGAAPETMPFPVTTQGSQQTQPPQKHYGITSPIS 

LAAPKETDCVLTQK\LAETLKPFGGFLKKEEGTA 

SRRNFNFGKN*INLVKEWIRR^^ 

ENV\GGKIFT/FLGSYRL/GEVHTKGADIDGVCVF 

APRHVDRSDFFTVSFYDKLKLQEEVKDLRAVEEA 

FVPVIKXCFDGIEIDILFARLALQTIPEDLDLRDDS 

LLKM.DIRCIRSLNGCRVTDEILHLVPN1DNFRLT 

LRAIKLWAKRHNIYSNILGFLGGVSWAMLVART 

CQLYPNAJASTLVHKFFLVFSKWEWPNPVLLKQP 

EECNLNLPVWDPRVNPSDRYHLMPIITPAYPQQN 

STYNVSVSTRMVMVEEFKQGLAITOEILLSKAE 

WSKLFEAPNFFQKYKHY1VLLASAPTENQRLEW 

VGLVESKIRILVGSLEKNEFITLAHVNPQSFPAPK 

ENPDKEEFRTMWVIGLVFKKTENSENLSVDLTY 

DIQSFTDTVYRQAINSKMFEVDMKJAAMHVKRK 

QLHQLLPNHVLQKKKKHSTEGVKLTALNDSSLD 

LSMDSDNSMSVPSPTSATKTSPLNSSGSSQGRNS 

PAPAVTAASVTNIQATEVSVPQVNSSESSGGTSSE 

SIPQTATQPAISPPPKPTVSRVVSSTRLVNPPPRSS 

GNAATSGNAATKIPTPIVGVKRTSSPHKEESPKK 

TKTEEDETSEDANCLALSGHDKTEAKEQLDTETS 

TTQSETIQTAASLLASQKTSSTDLSDIPALPANPIP 

VIKNSIKLRLNR 


3364 


A


54 


3073 


SARTMSYDYHQNWGRDGGPRSSGGGYGGGPAG 

GHGGNRGSGGGGGGGGGGRGAVQGPASRAPER 

PRNRHWREKTGAEEQAVKRRGKREL/LVHMDE 

RREEQIVQLLNSVQAKNDKESEAQISWFAPEDHG 

YGTEVSTK^TPCSEM<1DIQEKKLTNQEKKMFRI 

RNRSYIDRDSEYLLQENEPDGTLDQKLLEDLQKK 

ICTOLRYmMQHFREKLPSYGMQKELVNLIDNHQ 

VTVISGETGCGKTTQVTQFILDNYIERGKGSACRI 

VCTQPRRISAISVAERVAAERAESCGSGNSTGYQI 

RLQSRLPRKQGSILYCTTGIILQWLQSDPYLSSVS 

mVLDEIHER^QSDXO.MTVVKDLLNFRSDLKVI 

LMSATLNAEKFSEYFGNCPMIHIPGFTFPWEYLL 

EDVIEKIRYVPEQKEHRCQFKRGFMQGHVNSQE

KEEKEAIYKERWPDYVRELRRRYSASTVDVIEM 

MEDDKVDLNLIVALIRYIVLEEEDGAILVFLPGW 

DNISTLHDLLMSQVMFKSDKFLHPLHSLMPTVN

QTQVFKRTPPGVRKIVIATNiAETSITIDDVYYVID 

GGKIKETHFDTQNNISTMSAEWVSKANAKQRKG 

RAG\RVQPGSLLFICINGS*EASLLGWTIQLPEIF/R 

GTPLEELCLQIKVLRLGGI/GLFLSRLMDPPSNEA 

VLLSIRQL\RSLNALDKQEELTPLGVHLARLPVEP 

HIGKMILFGALFCCLDPVLTIAASLSFKDPFVIPLG 

KEKIADARRKELAKDTRSDHLTVVNAFEGWEEA 

RRRGFRYEKDYCWEYFLSSNTLQMLHNMKGQF 

AEHLLGAGFVSSRNPKDPESNINSDNEKIIKAVIC 

AGLYPKVAKIRLNLGKKRKMVKVYTKTDGLVA 

VHPKSWVEQTDFHYNWLIYHLKMRTSSIYLYD

CTEVSPYCLLFFGGDISIQKDNDQET1AVDEWIVF 

QSPARIAHLVKRAVVHMDERREEQIVQLLNSVQ 

AKNDKESEAQISWFAPEDHGYDKKYFFKE 


3365 


A 


439 


878 


ECCNVRPLRETDLLKMKRKPRASSPVVEEQPRA 

NTKJBTRKKKSFSQPMSASTKEESQDGRRKGK*L 

KGRARKKNAPQKSMALRJLEEGSRPTPSGHSDQL 

NEEL*QNELQLEQ/PEGT*LEQQSEGTQPEQQSGR 

MPTISTLSLSSE 


3366


A 


1 


827 


FRGYWGVREAFTDASWSGGLGPGKPGMKITRQ 

KHAKKHLGFFRNNFGVREPYQILLDGTFCQAAL 

RGRIQLREQLPRYLMGETQLCTTrlCVLKELETLG 

KDLYGAKLIAQKCQVRNCPHFKNAVSGSECLLS 

MVEEGNPHHYFVATQDQNLSWVKKKPGVPLM

FnQNTMVLDKPSPKTIAFVKAVESG\RI,SQCMRK 

KVSNISKRNRV* *KTLNRGRRKKRKK1SGPNPLS 

CLKX1CKXAPDTQSSASEKKRKI0CRIRNRS>IPKV 

LSEKQNAEGE 


3367 


A 


40


1467 


MLWGCRAKACWGPRLSDLVASLSPQRECISVHV 

GQAGVQIGNACWELFCLEHGIQADGTFDAQASK 

INDDDSFTTFFSETGNGKHVPRAVMriDLEPTVVD 

EVRAGTYRQLFHPEQLITGKEDAANNYARGHYT 

VGKESIDLVLDRTRKLTDACSGLQGFLIFHSFGGG 

TGSGFTSLLMERLSLDYGKKSKLEFAIYPAPQVS 

TAVVEPYNSILTTHTTLEHSDCAFMVDNEAIYDI 

CRRNLDIERPTYTNLNRLISQIVSSITASLRFDGAL 

NVDLTEFQTNLVPYPRIHFPLVTYAPIISAEKAYH 

EQLSVAEITSSCFEPNSQMVKCDPRHGKYMACC 

MLYRGDVVPKDVNVAIAAIKTKRT1QFVDWCPT 

GFKVGINYQPPTVVPGGDLAKVQRAVCMLSNTT 

AIAEAWARLDHKFDLMYAKRAFVHWYVGEGM 

EEGEFS*RPGEDLA\ALE\KDYEEVGTDSFEEENE 

GEEF 


3368 


A 


3 


2597


SLLEETMDEDSSLREYTVSLDSDMDDASKCLQE 

YDSGTGNTREALRPCPRTVSTKAQPGRSASSSSG 

DKTTSFAEQKIRKLNHTDGESSGSSSQKTTPEGSE 

LNIPHAGAWAQIPEETGLPQGRDTTQLLASEMV 

HLMMK\LKEKR\RAI*AQKKKMEAAFTKQRQKM

GRTAFLTVVKKKGDGISPLREEAAGAEDEKVYT 

DRAKEKESQKTDGQRSKSLADIKESMENPQAKW 

LKSPTTPIDPEKQGNLASPSEETLNEGEILEYTKSI 

EKLNSSLHFLQQEMQRLSLQQEMLMQMREQQS 

WVISPPQPSPQKQIRDFKPSKQAGLSS A1APFSSDV 

SPR\PTHPSSTSLLNRKSASFSVKSQRTPRPNELKI 

TPLNRTLTPPRSVDSLPRLRRFSPSQVPIQTRSFVC 

FGDDGEPQLKESKPKEEVKKEELESKGTLEQRG 

HNPEEKEIKPFESTVSEVLSLPVTETVCLTPNEDQ 

LNQPTEPPPKPVFPPTAPKNVNLIEVSLSDLKPPE 

KADVPVEKYDGESDKEQFDDDQKVCCGFFFKD 

DQKAENDMAMKRAALLEKRLRREKETQLRKQQ 

LEAEMEHKKEETRRKTEEERQKKEDERARREFIR 

QEYMRRKQLKLMEDMDTVIKPRPQVVKQKKQR 

PKSIHRDHEESPKTPIKGPPVSSLSLASLNTGDNES 

VHSGKRTPRSESVEGFLSPSRCGSRNGEKDWEN 

ASTTSSVASGTEYTGPKLYKEPSAKSNKHIIQNAL 

AHCCLAGKVNEGQKKKILEEMEKSDANNFLILF 

RDSGCQFRSLYTYCPETEEINKLTGIGPKSITKKM

IEGLYKYNSDRKQFSHIPAKTLSASVDAITIHSHL 

WQTKRPVTPKKLLPTKA 


3369 


A 


977 


594 


RGSGLTQEPGSVGQLALACAEGAVEWLYPAGAL 
RLTLGGPDPRARPGIACLRPVRPFAGAQVFAERA 
GGALELLLAEGPGPAGGRCVRWGPRERRALFLQ 
ATPHQDISRRVAAFRFELREDGRPEIAP 


3370 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 

YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRJRRSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 


3371 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 

YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 

TQEEISAJEVSAKAEKVHELNEEIGICLLABLAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRXLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 


3372 


A 


239 


3348


PMQNCMCSLTLSVLPLGPQPPVPEKRPPEIQHFR 
MSDDVHSLGKVTSDLAKRRKLTS\*GGLSEELGS 
ARRSGEVTLTKGDPGSLEEWETWGDDFSLYYD 
SYSVDERVDSDSKSEVEALTEQLSEEEEEEEEEEE
EEEEEEEEEEEEEDEESGNQSDRSGSSGRRKAKK 
KWRKDSPWVKPSRKRRKREPPRAKEPRGVNGV 
GSSGPSEYMEVPLGSLELPSEGTLSPNHAGVSND 
TSSLETERGFEELPLCSCRMEAPKIDRISERAGHK 
CMATESVDGELSGCNAAILKRETMRPSSRVALM 
VLCETHRARMVKHHCCPGCGYFCTAGTFLECHP 
DFRVAHRFHKACVSQLNGMVFCPHCGEDASEA 
QEVTIPRGDGVTPPAGTAAPAPPPLSQDVPGRAD 
TSQPSARMRGHGEPRRPPCDPLADTIDSSGPSLTL 
PNGGCLSAVGLPLGPGREALEKALVIQESERRKK 
LRFHPRQLYLSVKQGELQKVILMLLDNLDPNFQS 
DQQSKRTPLHAAAQKGSVEICHVLLQAGANINA 
VDKQQRTPLMEAVVNNHLEVARYMVQRGGCV 
YSKEEDGSTCLHHAAKIGNLEMVSLLLSTGQVD 
VNAQDSGGWTPIIWAAEHKHIEVIRMLLTRGAD 
VTLTDNEENICLHWASFTGSAAIAEVLLNARCDL 
HAVNYHGDTPLHIAARESYHDCVLLFLSRGANP 
ELRNKEGDTAWDLTPERSDVWFALQLNRKLRL 
GVGNRAIRTEKIICRDVARGYENVPIPCVNGVDG 
EPCPEDYKYISENCETSTMNmRMTHLQHCTCV 
DDCSSSNCLCGQLSIRCWYDKDGRLLQEFNKIEP 
PLIFECNQAGSCWRNCKNRVVQSGIKVRLQLYR 
TAKMGWGVRALQTIPQGTFICEYVGELISDAEAD 
VREDDSYLFDLDNKDGEVYCIDARYYGNISRFIN 
HLCDPNIIPVRVFMLHQDLRFPRIAFFSSRDIRTGE 
ELGFDYGDRFWDEKSKYFTCQCGSEKCKHSAEAI 
ALEQSRLARLDPHPELLPELGSLPPVNT 


3373 


A 


587 


1584 


PDGRLIVSCSEDKTIKIWDTTNKQCVNNFSDSVG 

FANFVDFNPSGTCIASAGSDQTVKVWDVRVNKL

LQHYQVHSGGWCISFHPSGNYTJTrASSDGTLKIL 

DLLKGRLIYTLQGHTGPVFTVSFSKGGELFASGG 

ADTQVLLWRTNFDELHCKGLTKRNLKRLHFDSP 

PHLLDIYPRTPHPHEEKVETVEDFFLHLLRLIQSL 

R*SICRSLLPLLWISFLLILPQQQKPWGLCQTRV

KRPVDIS*TLP*CHQNVCQQPRKRKQKT*VTSPV 

KVK7VSIPLAVTDALEHIMEQLNVLTQTVSILEQR 

LTLTEDKLKDCLENQQKLFSAVQQKS 


3374 


A 


398 


21 


WLYPMALSILDIKMSPSWYFHMAIGIINWNTTAG 
LSGTLYPKVPQKYILFDSVILLLGMLRKIRQVCQ 
NVYMKGCSPITLFKIVHYWPGAVAHAYNPSTLG 
GQVG/WQIT*GQEFETSLDYMVKPHLY


3375


A 


3 


1051 


VPTQQELAFPEQTNTKDWTVTPEHVLPESQSLLT 

FEEVAMYFSQEEWELLDPTQKALYNDVMQENY 

ETVISLALFVLPKPKVISCLEQGEEPWVQVSPEFK 

DSAGKSPTGLKLKNDTENHQPVSLSDLEIQASAG 

VISKKAKVKVPQKTAGKENHFDMHRVGKWHQ 

DFPVKKRKKLSTWKQELLKLMDRHKKDCAREK 

PFKCQECGKTFRVSS\DL\IKHQRIHTEEBCPYKCQ

QCDKRFRWSSDLNKHLTTHQGIKPYKCSWGGKS 

FSQNTNLHTHQRTHTGEKPFTCHECGKXFSQNS 

HLIKHRRTHTGEQPYTCSICRRNFSRRSSLLRHQK 

LHL*REACPVSHFWKTF 


3376 


A 


137


2329 


SFESPAPLPSTCFPQERQDPGPCYVSGAMAGLGP 
GVGDSEGGPRPLFCRKGALRQKWHEVK5HKFT 
ARFFKQPTFCSHCTDFIWGIGKQGLQCQVCSFVV 
HRRCHEFVTFECPGAGKGPQTDDPRNKHKFRLH 
SYSSPTFCDHCGSLLYGLVHQGMKCSCCEMNVH 
RRCVRSVPSLCGVDHTERRGRLQLEIRAPTADEI 
HVTVGEARNLIPMDPNGLSDPYVKLKLIPDPRNL 
TKQKTRTVKATLNPVWNETFVFNLKPGDVERRL 
SVEVWDWDRTSRNDFMGAMSFGVSELLKAPVD 
GWYKLLNQEEGEYYNVPVADADNCSLLQKFEA 
CNYPLELYERVRMGPSSSPPSPSPSPIDPKRCFFG 
ASPGRLrHSDFSFLMVLGKGSFGKVMLAERRGSD 
ELYAIIGLKKDVIVQDDDVDCTLVEKRVLALGG 
RGPGGRPHFLTQLHSTFQTPDRLYFVMEYVTGG
DLMYHIQQLGKFBCEPHAAFYAAEIAIGLFFLHNQ 
GIIYRDLKLDNVMLDAEGHIKITDFGMCKENVFP 
GTTTRTFCGTPDYIAPEIIAYQPYGKSVDWWSFG 
VLLYEMLAGQPPFDGEDEEELFQAIMEQTVTYP 
KSLSREAVAICKGFLTKHPGEAPGASGP*WGNLT 
IRAHGFFPLGFDWERLERL\EIPASFSRPRPCGPQR 
RGIFDKFFTRAAPA\LTPPARLVLDSIDQADFQGF 
TYVNPDFVQPDARSPTSTVHVPVM 


3377 


A 


918 


738 


SSMLWGFSVFRRSWILNCAVLSSSQVGISAACiaFS 
TLTHTHTHTHTHTRHAPFCGTCLYY 


3378 


A 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSVVSTDQESAEEIPILIIEG 

FLLrWKPLDTIWNRSYFLTIPYEECKRKIlSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLQGVI 


3379 


A 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSVVSTDQESAEEIPILEEG 

FLLFNYKPLDTIWNRSYFLTIPYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTOPS/CK*IRKLQGVI 


3380


A 


1443 


794 


ARRGELAGGGRASGGRSGGDGGGGGGARAPEG 

VRAPAAGQPRATKGAPPPPGTPPPSPMSSAIERKS 

LDPSEEPVDEVLQIPPSLLTCGGCQQNIGDRYFLK 

AIDQYWHEDCLSCDLCGCRLGEVGRRLYYKLGR 

KLCRJRDYLRLFGQDGLCASCDKRIRAYEMTMRV 

KDKVYHLECFKCAACQKHFCVGDRYLLINSDIV 

CEQDIYEWTKINGMI 


3381


A 


945 


474


SLKLRKPPLPTDGVHFVFVESQLDFWGPQEMLT 
QQGMALQNYDNKLVKCIEELCQKQEELCWQIQ 
QEEDKKQRLQNEVRQLTEKLACVNEKLARVNE 
NLARKIASCSKFYQTIAETEATYLKILESF*\TLLS 
VRKREAGNLTXATAPDQKSSGGRDS 


3382 


A 


1 


1458 


GIRGKMADRGGVGEAAAVGASPASVPGLNPTLG 

WRERLRAGLAGTGASLWFVAGLGLLYALRIPLR 

LCENLAAVTVFLNSLTPKFYVALTGTSSLISGLIFI 

FEWWYFHKHGTSFIEQVSVSHLQPLMGGTESSIS 

EPGSPSRNRENETSRQNLSECKVWRNPLNLFRGA 

EYRRYTWVTGKEPLTYYDMNLSAQDHQTFFTC 

DTDFLRPSDTVMQKAWRERNPPARIKAAYQALE 

LN/E*LCHCICSTG*GRSNNYCRC*KVI*TGTQGR 

RNNL*AVTAVPAPKSSA*SSTEERYQCTGIY*LKI 

GNVCKKIRKNKRSSKNNERFDE^ISSSYHVEHP*

KSL\KSLLELQAYPDVQAVLAKYDDISLPKSAAIC 

YTAALLKTRTVSEKFSPETASTRGLSAAEINAVD 

AIHRAVEFNPHVPKYLLEMKSLILPPEHILKRGDS

EAIAYAFFHLQHWKRIEGALNLLQCTWEGSKYS 

FPKVTLISLTIH 


3383


A 


282 


2443


RGKGFKEFFLGVCQTFIPCLCAEGIQLQFFCSGSG 
SSPLLKDLESMKTGLFFLCLLGTAAAIPTNARLLS 
DHSKPTAETVAPDNTAIPSLRAEAEENEKETAVS
TEDDSHHKAEKSSVLKSKEESHEQSAEQG\KSS\S
QELGIEGFKRDSDGSL*VWNL\EYGTNLKGTLDI 
KEDMSEPQEKKLSENTDFLAPGVSSFTDSNQQES 
ITKREENQEQPRNYSHHQLNRSSKHSQGLRDQG 
NQEQDPNISNGEEEEEKEPGEVGTHNDNQERKTE 
\LPREHANSKQEEDNTQSDDILEESDQPTQVSKM 
QEDEFDQGNQEQEDNSNAEMEEENASNVNKHIQ
ETEWQSQEGKTGLEAISNHKETEEKTViSEALLME 
PTDDGNTTPRNHGVDDDGDDDGDDGGTDGPRH 
SA\SDDYFHPKPGLFWEAERA\HSIAYSPSKLREQ 
REKVHENEMGTTEPGEHQEAIOCAENSSNEEETS 
SEGNMRVVHAVDSCMSFQCKRGfflCKADQQGKT 
SLVSCQDPVTVCPPTKPLDQVCGTDNQTYASSCH 
LFATKCRLEGTKKGHQLQLDYFG\ASKSIPTVCRD 
FEVIQ\FPLRMRDW\LKNILMQLYEANSEHAGYL 
NEK\QRNKVKKIYL\DEKRLLAGDHPIDLLLRDFK 
KNYHMYVYPVHWQFSELDQHPMDRVLTHSELA 
PLRASLVPMEHCITRFFEECDPNKDKHITLKEWG 
HCFGIKEEDIDENLLF 


3384


A 


3166 


928 


PSRPHPTHAAMAGPEGFQYRALYPFRRERPEDLE 
LLPGDVLVVSRAALQALGVAEGGERCPQSVGW 
MPGLNERTRQRGDFPGTYVEFLGPVALARPGPR 
PRGPRPLPARPRDGAPEPGLTLPDLPEQFSPPDVA 
PPLLVKLVEAIERTGLDSESHYRPELPAPRTDWSL

SDVDQWDTAALADGIKSFLLALPAPLVTPEASAE 

ARRALREAAGPVGPALEPPTLPLHRALTLRFLLQ 

HLGRVASRAPALGPAVRALGATFGPLLLRAPPPP 

SSPPPGGAPDGSEPSPDFPALLVEKLLQEHLEEQE 

VAPPALPPKPPKAK\PASTVPGPNGGSPPSL\QDA 

EWYWGDMSREEVNEKLRDTPDGTFLVRDASSKI 

QGEYTLTLRKGGNNKLIKVFHRDGHYGFSEPLTF 

CSVVDLINHYRHESLAQYNAKLDTRLLYPVSKY

QQDQIVKEDSVEAVGAQLKVYHQQYQDKSREY 

DQLYEEYTRTSQELQMKRTAIEAFNETEKIFEEQG 

QTQEKCSKEYLERFRREGN/QTKEMQRILLNSER 

LKSR1A\EIHESRTVKL\EQQLLVPRASDNKRD/IDK 

PH*TSLKPDLMQLRKIRDQYLVWLTQKGARQKK 

INEWLGIKNETEDQYALMEDEDDLPHHEERTWY 

VGKINRTQAEEMLSGKRDGTFLIRESSQRGCYAC 

SVWDGDTKHCVIYRTATGFGFAEPYNLYGSLK 

ELVLHYQHASLVQHNDALTVTLAHPVRAPGPGP 

PPAAR 


3385


A 


43 


2372 


TRDVNSWKELCFNHYNKETTNCYRTTRKWTNY 

KHFLGPFRELRSQGNQVILNLGKERCQLRETGLK 

LYLPGMDSARHHISHSTSAGPIPSQKEEEMTESQ 

GTVTFKDVAIDFTQEEWKRLDPAQRKLYRNVML 

*NYNNLITVGYPFTKPDVIFKLEQEEKPWVMEEE 

VLRRHWQGEIWGVDEHQKNQDRLLRQVEVKFQ 

KTLTEEKGNECQKKFANVFPLNSDFFPSRHNLYE 

YDLFGKCLEHNFDCH^?NVKCLMRKEHCEYNEP 

VKSYGNSSSHFVITPFKCNHCGKGFNQTLDLIRH 

LRfflTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPR1ASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTOEKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VC^JECGKAFSQRTrXIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYEGKECRKAFSHIC 

KNFITHQKIHTRE/KPFKCNHCGKGFNQTLDLIRH 

LRfflTGEBCPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPR1ASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKXNFITHQKJHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPY1CKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KOTITHQKIHTRENPLSVIR^EKASIRLWTSSDI 


3386 


A 


201 


1032 


AVDD^TQGALRRREAAEGLHFLGPPGRVRGQLR 

GITGPAWYCHSPSHSLLSAFCHLPTPSRCPAMAR 

PPVPGSVWPNWHES/RRGQGVPGLHSAQEPPAG 

VWAA*AASAAAA\LSEDTASYKIFVSGKSGVGKT 

ALVAKLAGLEWVVHHETTGIQTTVVFWPAKLQ 

ASSRVVMFRFEFWDCGESALKXFDHMLLACME 

NTDAFLFLFSFTDRASFEDLPGQLARIAGEAPGV 

VRMVIGSKFDQYMHTDVPERDLTAFRQAWELPL 

LRVKSVPGRRLG 


3387 


A 


86 


96 


GSSPDPASLITMKNQDKKNGAAKQSNPKSSPGQP 

EAGPEGAQERPSQAAPAVEAEGPGSSQAPRKPEG 

AQARTAQSGALRDVSEELSRQLEDILSTYCVDNN 

QGGPGEDGAQGEPAEPEDAEKSRTYVARNGEPE 

PTPWNGEKEPSKGDPNTEEIRQSDEVGDRDHRR 

PQEKKKAKGLGKEITLLMQTLNTLSTPEEKLAAL 

CKKYAELLEEHRNSQKQMKLLQKKQSQLVQEK 

DHLRGEHSKAVLARSKLESLCRELQRHNRSLKE 

EGVQRAREEEEKPIKEVTSHFQVTLNDIQLQMEQ 

HNERNSKLRQENMELAERLKKLIEQYELREEHID 

KVFKHKDLQQQLVDAKLQQAQEMLKEAEERHQ 

REKDFLLKEAVESQRMCELMKQQETHLKQQLA 

LYTEKFEEFQNTLSKSSEVFTTFKQEMEKK1TKKI 

KKLEKETTMYRSRWESSNKALLEMAEEKTVRD 

KELEGLQVKIQRLEKLCRALQT/GAQ*PVRGQRW

GSHRTSAVRIFS


3388 


A 


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKK

NKGKERRDLDDLKKEVAMTEHKMSVEEVCRKY 

NTDCVQGLTOSKAQEILARDGPNALTPPPTTPEW 

VKFCRQLFGfGFSILLWIGAILCFLAYGIQAGTEDD 

PSGDNLYLGIVLAAVVnTGCFSYYQEAKSSKIME 

SFKNMVPQQALVIREGEKMQVNAEEVVVGDLV 

EIKGGDRVPADLRIISAHGCKVDNSSLTGESEPQT 

RSPDCTHE\NPLKTRNITFFSNNFVEGTARGVVVA 

TGDRTVMGRIATLASGLEVGKTPIAIEffiHFIQLIT 

GVAVFLGVSFFILSLILGYTWLEAVIFLIGirVANV 

PEGLLAWTVCLTLTAKRMARKKCLVKNLEAVE 

TLGSTSTICSDKTGTLTQNRMTVAHMWFDNQIH 

EADTTEDQSGTSFDKSSHTWVALF*H/LLGFCNR 

PVFKGGQDNTPVLKRDVAGDASESALLKCEELSS

GSVKLMRERNKKVAEIPFNSTOKYQLSIHETEDP 

NDNRYLLVMKGAPERILDRCSTILLQGKEQPLDE 

EMKEAFQNAYLELGGLGERVLGFCHYYLPEEQF 

PKGFAFDCDDVNFTTDNLCFVGLMSMGPPRAA 

VPDAVGKCRSAGIKVIMVTGDHPITAKAIAKGV 

GIIFEGNETVEDIAARLNIPVSQVNPRDAKACVIH 

GTDLKDFTSEQIDEILQNHTEIVFARTSPQQKLnV 

EGCQRQGAIVAVTGDGVNDSPALKKADIGVAM 

GIAGSDVSKQAADMILLDDNFASrVTGVEEGRLI 

FDNLKKSIAYTLTSNIPEITPFLLFIMANIPLPLGTI 

T1LCIDLGTDMVPAISLAYEAAESDIMKRQPRNPR 

TDKLVNERLISMAYGQIGMIQALGGFFSYFVILA 

ENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQW 

TYEQRKWEFTCHTAFFVSIVWQWADL1ICKTR 

RNSVFQQGMKNKJDLIFGLFEETALAAFLSYCPGM 

DVALRMYPLKPSWWFCAFPYSFLIFVYDEIRKLI 

LRRNPGGWVEKETYY 


3389 


A 


45 


5250 


VERLLGCRNSKRTWRMLISKNMPWRRLQGISFG 
MYSAEELKKLSVKSITNPRYLDSLGNPSANGLYD 
LALGPADSKEVCSTCVQDFSNCSGHLGHIELPLT
VYNPLLFDKLYLLLRGSCLNCHMLTCPRAVIHLL 
LCQLRVLEVGALQAVYELERILNRFLEENPDPSA 
SEIREELEQYTTEIVQNNLLGSQGAHVKNVCESK 
SKLIALFWKAHMNAKRCPHCKTGRSVVRKEHNS 
KLTITFPAMVHRTAGQKDSEPLGIEEAQIGKRGY 
LTPTSAJ^HLSALWKNEGFFLNYLFSGMDDDGM
ESRFNPSVFFLDFLVVPPSRYRPVSRLGDQMFTN 
GQTVNLQAVMKI>VVLIRJCLLALMAQEQKLPEE 
VATPTTDEEKDSLIAIDRSFLSTLPGQSLIDKLYNI 
WIRLQSHVNIVFDSEMDKLMMDKYPGIRQILEK 
KEGLrTOCHMMGKRVDYAARSVlCPDMYINTNEI 
GIPMWATKLTYPQPVTPWNVQELRQAVINGPN 
VHPGASMVINEDGSRTALSAVDMTQREAVAKQ 
LLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPT 
LHRPSIQAHRARILPEEKVLRLHYANCKAYNADF 
DGDEMNAHFPQSELGRAEAYVLACTDQQYLVP 
KDGQPLAGLIQDHMVSGASMTTRGCFFTREHYM 
ELVYfeGLTDKVGRVKLLSPSILKPFPLWTGKQVV 
STLLINnPEDHIPLNLSGKAKITGKAWVKETPRSV 
PGFNPDSMCESQVIIREGELLCGVLDKAHYGSSA 
YGLVHCCYEIYGGETSGKVLTCLARLFTAYLQL 
YRGFTLGVEDILVKPKADVBCRQRJIEESTHCGPQ 
AVRAALNLPEAASYDEVRGKWQDAHLGKQQRD 
FNMIDLKFKEEVNHYSNEINKACMPFGLHRQFPE 
NTLQLMVQSGAKGSTVNTMQISCLLGQIELEGRS 
TPLMASGKSLPCFEPYEFTPRAGGFVTGRFLTGIK 
PPEFFFHCMAGREGLVDTAVKTSRSGYLQRCIIK 
HLEGLWQYDLTVRDSDGSVVQFLYGEDGLDIP
KTQFLQPKQFPFLASNYEVIMKSQHLHEVLSRAD 
PKKALHHFRAIKKWQSIOiPNTLLRRGAFLSYSQ 
KIQEAVKALKLESENRNGR/RPWDS/G/RMLRMW 
YELDEESRRKYQKKAAACPDPSLSVWRPDIYFAS 
VSETFETKVDDYSQEWAAQTEKSYEKSELSLDR 
LRTLLQUKWQRSLCEPGEAVGLLAAQSIGEPST 
QMTLNTFHFAGRGEMhJVTLGEPRLREILMVASA 
MKTPMMSVPVLNTKKALKRVKSLKKQLTRVCL 
GEVLQKIDVQESFCMEEKQNKFQVYQLRFQFLP 
rL^YYQQEKCLRPEDILRFMETRFFKLLMESIKKK 
NNKASAFRKVNTRRATQRDLDNAGELGRSRGE 
QEGDEEEEGHIVDAEAEEGDADASDAKRKEKQE 
EEVDYESEEEEEREGEENDDEDMQEERNPHREG 
ARKTQEQDEEVGL/GH*GGPVPSRPPDAAPETHP
QPGAPGA\EAMERRVQAVREIHPFIDDYQYDTEE 
SLWCQVTVKiPLMKINFDMSSLVVSLAHGAVIY 
ATKGITRCLLNETTNNKNEKELVLNTEGINLPELF 
KYAEVLDLRRLYSNDIHAIANTYGIEAALRVIEK 
EIKDVFAVYGIAVDPRHLSLVADYMCFEGVYKP 
LNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSH 
DELRSPSACLVVGKVVRGGTGLFELKQPLR 


3390 


A 


2 


2080 


ILPPLEGPPAQASPSSTMLGEGSQPDWPGGSRYD 
LDEIDAYWLELINSELKEMERPELDELTLERVLE

ELETLCHQNMARAIETQEGLGIEYDEDWCDVC 

RSPEGEDGNEMVFCDKCNVCVHQACYGILKVPT 

GSWLCRTCALGVQPKCLLCPKRGGALKPTRSGT 

KWV^SCALWIPEVSIGCPEKMEPITKJSHIPASR 

>VALSCSLCKECTGTCIQCSMPSC\VTAFHVTCAF 

DHGLEMRTJLADNDEVKFKSFCQEHSDGGPRNE 

PTSEPTEPSQAGEDLEKVTLRKQRLQQLEEDFYE 

LVEPAEVAERLDLAEALVDFIYQYWKLKRKANA 

NQPLLTPKTDEVDNLAQQEQDVLYRRLKLFTHL 

RQDLERVRl^CYMVTRRERTKHAICKLQEQIFH 

LQMKLIEQDLCRAGLSTSFPIDGTFFNSWLAQSV 

QJTAENMAMSEWPLNNGHREDPAPGLLSEELLQ 

DEETLLSFMRDPSLRPGDPARKARGRTRLPAKK 

KPPPPPPQDGPGSRTTPDKAPKKTWGQDAGSGK 

GGQGPPTRKPPRRTSSHLPSSPAAGDCPILATPES 

PPPLAPETPDEAASVAADSDVQVPNGPAASPKPLG 

RLRPPPREPR*T\RRLPGC/ARPDAGDGDHLSAVA

ERPKV\SLHFDTETDG\YFS\DGEMSNS\DV\EAED 

GGVQRGPREAGAKEWVRMGVLAS 


3391 


A 


1555 


327 


NSFLHFLHLKVRTMFLFPSFPVLLLSVVTASCSKT 

KACADTQKTCSMTTCGIPVTNGTPGRDGRDRPK 

GEKGEPGLGQVSVAS*ISTSGRCSSKSVLEPATRG 

LKHRLGEAPLSSGPMLHSEQPL*NAIASKTKLFV 

DSLGSHISTQELGVCGCPFRGVSCLVGELALVQA 

LH*VAGESFFFGSDHWLIGCAGGEQEWSIELLGK 

KKRVTATGSSSLCLATGQGLRGLQGPPGKMGPP 

GNTGTSGIPGPRGQKGDRGDNSVAEAKLANLER 

KL*SLRSELDHTKKL*PFSLGK\MSGKKLFVTNGE 

RMPFSKVKALCAGLQATVAAPKNAEENKAIQDV 

AKDTAFLGITDEATEGQFMYLTGGRLTYSNWKX 

DEPNDHGSGEDCVILLNNGLWNGISCTSSFIAICE

FPA 


3392 


A 


218 


1773 


GGSRRNQRRSIPVLGYFLKQKKMTKAQESLTLE 

DVAVDFTWEEWQFLSPAQKDLYRDVMLENYSN 

LVSVGYQAGKPDALTKLEQGEPLW1LEDEIHSP 

AHPEIEKADDHLQQPLQNQKILKRTGQRYEHGR 

TLKSYLGLTNQSRRYNRICEPAEFNGDGAFLHDN 

HEQMPTEIEFPESRXPISTKSQFLKHQQTHNIEKA 

HECTDCGKAFLKKSQLTEHKRIHTGKXPHVCSL 

CGKAFYKKYRLTEHERAHRGEKPHGCSLCGKAF 

YKRYRLTEHERAHKGEKPYGCSEGGKAFPRKSE 

LTEHQRIHTGIKPHQCSECGRAFSRJCSLLVVHQR 

THTGEKPHTCSECGKGFIQKGNLNIHQRTHTGEK 

PYGCEDCGKAFSQKSCLVAHQRYHTGKTPFVCPE 

CGQPCSQKSGLIRHQKIHSGEKPYKCSDCGKAFL 

TKTMLIVHHRTHTGERPYGCDECEKAYFYMSCL 

VKrIKRIHSREKRGD/CSEGGKSFHSKSQLKS**TC 

AGEKPC*YGNCGNGGRAV 


3393 


A 


46 


1464 


ARSLSGAPSGSSRQDGTSLLRTGAGYSSSQSIETL 

SLPPGPSHLVGDKSQGGRSCQGQITSAASGKTSK 

SEPNHVIFKKJSRDKSV7MYLGNRDY\IDHV\SQV 

QPVDGWLVDPDLVKGKKVYVTLTCAFRYGQE 

DIDVIGLTFRRDLYFSRVQVYPPVGAASTPTKLQ 

ESLLKKLGSNTYPFLLTFPDYLPCSVMLQPAPQD 

SGKSCGVDFEVKAFATDSTDAEEDKIPKKSSVRL 

LIRKVQHAPLEMGPQPRAEAAWQFFMF\DKPLH 

LAVSLNKRDLFPMGSPIPVPVSVP\NNTEKPVKK1

KA\SVEQVANWLYS\SDY\YVKPVAMEEAQEKV 

PPNSTWTKA\LTLL\PWLVNNRERRGIALDGKI^ 

EDTNLASSTIIKEGIDRKRSWEILVSYPDQR*SSTV 

SGFLGRASPSQ*SRPT*RSQFRL\MHPQP\EDPA\K 

ESYQDAM,VF\EEFARP*ILKDAGEA*\EGKRDQE 


3394 


A 


211 


1591 


RPPTMAADQRPKADTLALRQRLISSSCRLFFPEDP 

VKIVRAQGQYMYDEQGAEYIDCISNVAHVGHCH 

PLVVQAAHEQNQVLNTNSRYLHDNIVDYAQRLS 

ETLPEQLCVFYFLNSGSEANDLALRLARHYTGH 

QDVWLDHAYHGHLSSLIDISPYKFRNLDGQKE 

WVHVAPLPDTYRGPYREDHP\THVEDGLEKAFS* 

KRVVQGRNRQICRRQIAAFFAESLPSVGGQIIPPA 

GYFSQVAEHIRKAGGVFVADEIQVGFGRVGKHF 

WAFQLQGKDFVPDIVTMGKSIGNGHPVACVAAT 

QPVARAFEATGVEYFNTFGGSPVSCAVGLAVLN 

VLEKEQLQDHATSVGSFLMQLLGQQKIKHPIVG 

DVRGVGLFIGVDLIKDEATRTPATEEAAYLVSRL 

PCENYVLLSTDGPGRMLKFKPPMCFSLDNARQV 

VAKLDAILTDMEEKVRSCETLRLQP 


3395 


A 


1 


1424 


FRDGFSLRCGCNAELPGRGGDDAADRAIQRFLR 

TGAAVRYKVMKNWGVIGGIAAALAAGIYVTWG 

PITERKKRRKGLVPGLVNLGNTCFMNSLLQGLSA 

CPAFIRWLEEFTSQYSRDQKEPPSHQYLSLTLLHL 

LKALSCQEVTDDEVLHASCLLDVLRMYRWQISS 

FEEQDAHELFHVTTSSLEDERDRQPRVTHLFDVH 

SLE\HSQK*LPKQITCRTRGSPHPTSNHWKSQHPF 

HGRLTSNMVCKHCEHQSPVRFDTFDSLSLSIPAA 

TWGHPLTLDHCLHHFISSESVRDVVCDNCTKIEA 

KGTLNGEKVEHQRTTFVKQLKLGKLPQCLCIHL
QRLSWSSHGTPLKRHERTVQFNEFLMMDIYKYHL
LGHKPSQHNPKLNKNPGPTLELODGPGAPTPGL
NQPGAPKTQIFMNGACSPSLLPTLSAPMPFPLPV

VPDYSSSTYLFRLMGSCRPPWETWHSGTLCSFTD 
GPHL 


3396 


A 


109 


107 


TQEAGLIFFSPPFSLSLSLSLPLSLFLLSHPHSRTPP 

NRTPRRTRIPQRPAVMYSPLCLTQDEFHPFIEALL 

PHVRAFAYTWFNLQARKRKYFKXHEI<OlM 

ERAVKDELLSEKPEVKQKWASRLLAKLRKDIRP 

EYREDFVLTVTGKKPPCCVLSNPDQKGKMRRID 

CLRQADKVWRLDLVMVELFKGIPLESTDGERLV 

KSPQCSNPGLCVQPHfflGVSVKELDLYLAYFVH 

AADSSQSESPSQAK*R*H*GPARKWDIWGFQ\DS 

FVT\SGVF\SVT*A*LRVSQTPI\AAG\TGPNFSLSD 

LESSSYYSMSPGAMRRSLPSTSSTSSTItRLKSVED 

EMDSPGEEPFYTGQGRSPGSGSQSSGWHEVEPG 

MPSPTTLKKSEKSGFSSPSPSQTSSLG\TAFTQHHR 

PVITGTQSKFHIATPSIL\HFPRHSPFFQQPGPYFSH 

PAIRYHPQETLKEFVQLVCPDAGQQAGQPNGSS

QGKVHNPFLPTPMLPPPPPPPMARPVPLPVPDTK 

PPTTSTEGGAASPTSPTTRS/PGRTOPQQPFL/SYG 

PP*PSNALIGGGGGGAGERAGERADLEM 


3397


A 


1


2002 


TGTLTEDGLDVMGVVPLKGQAFLPLVPEPRRLP 
VGPLLRALATCHALSRLQDTPVGDPMDLKMVES 

TGWVLEEEPAADSAFGTQVLAVMRPPLWEPQLQ 

AMEEPPVPVSVLHRFPFSSALQRMSWVAWPGA 

TQPEAYVKGSPELVAGLCNPETVPTDFAQMLQS 

YTAAGYRVVALASKPLPSVPSLEAAQQLTRDTV 

EGDLSLLGLLVMRNLLKPQTTPVIQALRRTRIRA 

VMVTGDNLQTAVTVAE.GCGMVAPQEHLITVHA 

THPERGQPASLEFLPMESPTAVNGVKDPDQAAS 

YTVEPDPRSRHLALSGPTFGIIVKHFPKLLPKVLV 

QGTVFARMAPEQKTELVCELQKLQYCVGMCGD 

GANDCGALKAADVGISLSQAEASWSPFTSSMA 

SIECVPMVIREGRCSLDTSFSVFKYMALYSLTQFI 

SVLILYTINTNLGDLQFLAroLWTTTVAVLMSRT 

GPALVLGRVRPPGALLSVPVLSSLLLQN4VLVTG 

VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNYEN 

TVVFSLSSFQYLILAAAVSKGAPFR\RPLTNNVPF 

LLASAL*SSVLVVLVLSPGLLHGPLALRNITDTGF 

KLLLVGLVTLNFVGGLHAGERARPVPPRLPAPPP 

AQAG\SKKRFKQLERELAEQPWPPLPAGPLR 


3398 


A 


758 


1368 


FPFRMLTGYLYLMWRRKAFWSGTQRHPLPGGL 

KRRRRPGRGPWPAPGGQGVGPSAL*KAGSPPAN 

RPGQGE/PGLISPKPVTEVLPDVQGAPVPVPPLPT 

PPSLPHLQNQPP/TVQHYLLSFSWKPSQGPE*RA* 

PSPLPPAAMRPDG*PGPASQGPDQPG\PCPPASLP 

TSPPGKGFQKTETRKHPPPRQQHKPKCTANRPLA 

SFL


3399 


A 


906 


1091 


HHrmHHHHHHHHHLVAFGKVQ*LQNSPSSSSSS
SSGCFWQARFSSYRTLHHHHHHHHHHHHH 


3400 


A 


1838 


325


PFLSVHRSPHGPSKLCDDPQASLVPEPVPGGCQE 

PEEMSWPPSGE1ASPPELPSSPPPGLPEVAPDATST 

GLPDTPAAPETSTNYPVECTEGSAGPQSLPLPILE 

PVKNPCSVKDQTPLQLSVEDTTSPNTIO^CPPTPTT 

PETSPPPPPPPPSSTPCSAHLTPSSLFPSSLESSSEQ 

KFYNFV1LHARADEHIALRVSGRSWEALGVPDG 

ATFCEDFQVPGRGELSCLQDAIDHSAFIILLLT\SN 

\FDCR\LSLHQVNQAMMSNLI\RQGSQDCVIP\FLP 

\LESSPARLSSDTASLLSGLVRLDEHSQIFARKVA 

NTFKPHRLQARKAMWRKEQDTRALREQSQHLD 

GERMQAAALNAAYSAYLQSYLSYQAQMEQLQV 

AFGSHMSFGTGAPYGARMPFGGQVPLGAPPPFP

TWPGCPQPPPLHAWQAGTPPPPSPQPAAFPQSLP 

FPAVPKPFPTASTAPPSEPKGWQP\LIIHHAQMVT 

SWG*NKH\MAVNQRGSQAPEDKTQEAE 


3401 


A 


153 


1389 


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI

KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMD1GDSFGHPACPLVSRSRNSPVEDDDDDD 

DVVFIESIQPPSISAPAIADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISEHVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTODLDFSTSSLSRSK 

WAGMGNSGITTELTLKYnTNVTTLETGISSVNA 

GQDVNniTYKTSL*NTNLGDVAKGLQSSNFGVNI 

Kill Iroi^ir^livJLuVVNLLlJLVb MWt^Jbl YrRME 
NLQLII/CPEDASTKKANVILPVESSKSFQEFYSTS 
CLSPCENNWNLKKGVFh^SRCTICSKLAEVWIFI 
PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 


3402 


A 


153 


1389 


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEr 

KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DVVFIESIQPPSISAPAIADQR2VTFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETIVEDDEEDIETNGG 

AEKKSSCHEWGLPGTKNKTNDLDFSTSSLSRSK 

VNAGMGNSGITTELTLKYETNVTTLETGISSVNA 

GQDVNinTYKTSL*NTNLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTGV\NLLTLVE*MWQETYFRME 

NLQLiyCPEDASTKKANVILPVESSKSFQEFYSJS 

CLSPCEN>TWNLKKGVFNKSRCTICSKLAEVWIFI 

PKLLFRLTVI1LTFKCYYVLFHLHNARVLDV


3403 


A 


609 


2765 


SRHCTPAERQNETHRAPDFAMSAVLGHQPPFFPA 

LTLPPNGAAALSLPGALAKPIMDQLVGAAETGIP 

FSSLGPQAHLRPLKTMEPEEEVEDDPKVHLEAKE 

LWDQFHKRGTEMVITKSGRRMFPPFKVRCSGLD 

KKAKYILLMDIIAADDCRYKFHNSRWMVAGKA 

DPEMPKRMYIHPDSPATGEQWMSKVVTFHKLKL 

TNNISDKHGFTILNSMHKYQPRFHIVRANDILKLP 

YSTFRTYLFPETEFIAVTAYQNDKITQLKIDNNPF 

AKGFRDTGNGRREKRKQLTLQSMRVFDERHKK 

ENGTSDESSSEQAAFNCFA\QASSPAA\PL*RTSNL 

KDFVSPSRG*RATPEAEEQRGSTAPRPATRAKISP

HPRRRSPAVTRAAPAVKAHLFAAERPRDSGRLD 

KASPDSRHSPATISSSTRGLGAEERRSPVREG\QA 

PAKVEEARALPGKEAFAPLTVQTDAAAAHLAQG 

PLPGLGFAPGLAGQQFFNGHPLFLHPSQFAMGG 

AFSSMAAAGMGPLLATVSGASTGVSGLDSTAM 

ASAAAAQGLSGASAATLPFHLQQHVLASQGLA 

MSPFGSLFPYPYTYMAAAAAA/SSAAASASVHRT 

P\E>JLNTMRPRLRYSPYSIPVPVPDGSSLLTTALPS

MAAAAGPLDGKAAALAASPASWAVDSGSELNS 

RSS\TLSSSSMSLSPKLCAEKEAATSELQSIQRLVS 

GLEAKPDRSRSASP 


3404 


A 


1082 


1308 


LKKFLEVPQSYSLLLSSPFLQ\WRA*RPQNAIG*Q 
FIIKTLVFFGIMRSAGDVLSTQVSCALRIMRTAGC
SHSSP 


3405


A 


1553 


559 


PRPPTQRLSRFAPPCRTAEFPFRRRAWTRPAPPR
ACTVVGRSSPVTGLAVGAAVAMLTVAARSRPFA 
PVLSATSRGVAGALT\P*MQATVPATPEQPVLDL 
KRPFLSRESLSGQAVRRPLVASVGLNVPASVCYS 
HTDIKVPDFSEYRRLEVLDSTKSSRESSEARKGFS 
YLVTGVTTVGVAYAAKNAVTQFVSSMSASADV 
LALAKIEIKLSDIPEGK^^vtAFKWRGKI > LFVRHRT 
QKEIEQEAAVELSQLRDPQHDLDRVKKPEWVILI 
GVCTHLGCVPIANAGDFGGYYCPCHGSHYDASG 
RIRLGPAPLNLEVPTYEFTSDDMVIVG 


3406 


A 


83 


2671


CLYPDFCRSVTCAMPCFTHRSCREDPGTSESREM 

DPVAFKDVAVNFTQEEWALLDISQKNLYREVML 

ETFWNLTSIGKKWKDQNIEYEYQNPRRNFRSVT 

EEKVNEIKEDSHCGETFTPVPDDRLNFQKICKASP 

EVKSCDSFVCEVGLGNSSSNMNIRGDTGHKACE 

CQEYGPKPWKSQQPKKAFRYHPSLRTQERDHTG 

KICPYACKECGKNnYHSSIQRHMVVHSGDGPYK 

CKFCGKAFHWLSLYLIHERTHTGEKPYECKQCG 

KSFSYSATHRIHERTfflGEKPYECQECGKAFHSPR 

SCHRHERSHMGEKAYQCKECGKAFMCPRYVRR 

HERTHSRXKLYECKQCGKALSSLTSFQTHIRMHS 

GERPYECKTCGKGFYSAKSFQRHEKTHSGEKPY 

KCKQCGKAFTRSGSFRYHERTHTGEKPYECKQC 

GKAFRSAPNLQSHGRTHTGEKPYECKECGKAFIF 

VNNLQSHERTQTHIRIHSGERRYKCKICGKGFYC 

PKSFQRHEKTHTGEKLYEC/TATFSSSFSSSSSF*Y 

HERTHTGEKPYKCEQCGKAFRAVSIL*MHGRTH 

PEEKPYECEQ*RKAFRSAPHL*IRGRTrINGEKPY 

ACKKCGKPFGSAQNLRIHERTQTHIMHSVERPYK 

CKICGRGFYSAKSFQTHEKSYTGEKPYECKQCG 

KAFVSFTSFRYHERTHTGENPYECKQFGKAFRSV 

KNLRFHKRTHTGEKPCEYMKRLTLEGNTMNAS 

NVAKLSLLPVLFMWKEFTLGRNPISVSNVRKPLF 

LPLLFMMKGLTWERNPMSVCHVGKPSFLLVPFN 

IMKGLTLERSPMNISNVGKPSDQPRTFKCMEGLT 

LEKNPMNVSSMGKRSDLTRFFEYR 


3407 


A 


1426 


3 


PAAPSGASPGRVCGVETARPLGVQRRQSADEGP 
PGVAGLRHEPPTVWLGSVAHRGTWVCAHRWFG
PAVTRAAQAATMVKLLVAKILCMVGVFFFMLL 
GSLLPVKIIETDFEKAHRSKKILSLCNTFGGGVFL 
ATC\LTALLARC*GKSSRRSWSLGHISTDYPL\AE 
TTLLLGFFMTVFLEQLILTFAQENAVLHRPGDLQR 
RIGRGQRLGV*EPLHGGRAGPRAVRGAPRPRPQP 
ERAGPLA\PSPVRJLLSLAFALSAHSVFEGLALGLQ
EEGEKVVSLFVGVAVHETLVPVALGISMAGSAM 
PLRDAAKLAVTVSPMIPLGIGLGLGIEKAQGVPG 
SVASVLLQGPGGRHLSLFITFPGKSWPRSWRKKS
DRLLKVLFOLVVGYTVLAGMGLPQVVSGLAIVPA 
AGSPPGAPGRTQAASPGRASPKSEHCGPGPPPVH 
KGPPGTRLCPRSYTLSLRALLLFKILLSLKSLYQK 
KK 


3408 


A 


106 


4514 


EARDRLAQSRAKEKELNSVASELSARQEESEHSH 

KHLIELRREFKIQWPEEIREMVAPVLKSFQAEVV 

ALSKRSQEAEAAFLSVYKQLIEAPALWELKLKSR 

PALGDSRVQQGQHDPKTDNQNTQQKAGFKEGW 

LAEASEREAFGPGFKDPVPVFEAARSLDDRLQPP 

SFDPSGQPRRDLHTSWKRNPELLSPKALKATQAE 

LLELRRKYDEEAASKADEVGLIMTNLEKANQRA 

EAAQREVESLREQLASVNSSIRJLACCSPQGPSGD 

KVNFTLCSGPRLEAALASKDREILRLLKDVQHLQ 

SSLQELEEASANQIADLERQLTAKSEAIEKLEEKL 

QAQSDYEEIKTELSILKAMKLASSTCSLPQGMAK 

PEDSLLIAKEAFFPTQKFLLEKPSLLASPEEDPSED 

DSIKDSLGTEQSYPSPQQLPPPPGPEDPLSPSPGQP 

LLGPSLGPDGTRTFSLSPFPSLASGERLMMPPAAF

KGEAGGLLVEPPAFYGAKPPTAPATPAPGPEPLG 

GPEPADGGGGGAAGPGAEEEQLDTAEIAFQVKE 

QLLKHNIGQRVFGHYVLGLSQGSVSEILARPKP\ 

WRKLHG**GKEPFIKMKQFLSDEQNVLALRTIQV

RQRGSITPRIRTPETGSDDAIKSILEQAKKEIESQK 

GGEPKTSVAPLSIANGTTPASTSEDAIKSILEQAR 

REMQAQQQALLEMEVAPRGRSVPPSPPERPSLAT 

ASQNGAPALVKQEEGSGGPAQAPLPVLSPAAFV 

QSIIRKVKSEIGDAGYFDHHWASDRGLLSRPYAS 

VSPSLSSSSSSGYSGQPNGRAWPRGDEAPVPPED 

EAAAGAEDEPPRTGELKAEGATAEAGARLPYYP 

AYWRTLKPTVPPLTPEQYELYMYREVDTLELTR 

QVKEKLAKNGICQRIFGEKVLGLSQGSVSDMLSR 

PKPWSKLTQKGREPFIRMQLWLSDQLGQAVGQQ 

PGASQASPTEPRSSPSPPPSPTEPEKSSQEPLSLSLE 

SSKENQQPEGRSSSSLSGKMYSGSQAPGGIQEIV 

AMSPELDTYSITKRVKEVLTDNNLGQRLFGESIL

GLTQGSVSDLLSRPKPWHKLSLKGREPFVRMQL 

WLNDPHNVEKLRDMKKLEKKAYLKRRYGLIST 

GSDSESPATRSECPSPCLQPQDLSLLQIKKPRVVL 

APEEKEALRKAYQLEPYPSQQTffiLLSFQLNLKT 

NWINWFHNYRSRMRREMLVEGTQDEPDLDPSG 

GPGILPPGHSHPDPTPQSPDSETEDQKPTVKELEL 

QEGPEENSTPLTTQDKAQVRKQEQMEEDAEEE 

AGSQPQDSGELDKGQGPPKEEHPDPPGNDGLPK 

VAPGPLLPGGSTPDCPSLHPQQESEAGERLHPDP 

LSFKSASESSRCSLEVSLNSPSAASSPGLMMSVSP 

VPSSSAPISPSPPGAPPAKVPSASPTADMAGALHP 

SAKVNPNLQRRHEKMANLNNIIYRLERAANREE 

ALEWEF 


3409 


A 


162 


1710 


GPLSPGPYQCRPSLPAQLYPQSLMAAATLRTPTQ 

GTVTFEDVAVHFSWEEWGLLDEAQRCLYRDVM

LENLALLTSLDVHHQKQHLGEKHFISNVGRALF 

VKTCTFHVSGEPSTCREVGKDFLAKLGFLHQQA 

AHTGEQSNSKSDGGAISHRGKTHYNWGEHTKAF 

SGKHTLVQQQRTLTTERCYICSECGKSFSKSYSL 

NDHWRLHTGEKPYECRECGKSFRQSSSLIQHRR 

GHTAVRPHECDECGKLFSNKSNLIKHRRVHTGE 

RPYECSECGKSFNQRSALLQHRGVHTGEKPYEC 

TECGKSFSHNSSL3KHQRIHSG*\RPYECTECGKSF 

SQNSSLIEHHRVHTGERPYKCSECGKSFRQRSAL 

LQHRGVPTGERPYECSECGKFFPYSSSLGKHQRV 

HTGSRPYECSECGKSFTQNSGLIKHRRVHTGEKP 

YECTE*KKSFSHNSSLIKHQRfflSR*KPYE\CKCG 

N\R*HPGESP*VHSECQ/KSFS*RPYLIECHTVHKG 

KTLLICRDVQLI 


3410 


A 


167 


789 


LCMKGISGGVRVAALAARAEREELPVPAMEPQP 

TAWGSPHPEAVLQLEVAPESSGPCTDTAKDQQS 

DKLPDLMPPA\EPLGSALELRASLEIDVAE\RGCE 

HGPSQQLPRCP*SWAWSEPWCQRPGCAV*APLP 

Y*REASFIYQSHSPAASGPFHSAGAGAVYLQAGG 

V/GEQEKEAVRKGSGSSSCSQRGP\PPPGMEVCPL 

LGFWAICP 


3411 


A 


1040 


887 


ASLSKPAGISTMPWALELLFLLTHSAVSWQAGL

TQPPSVSKDLR\QTATLTCTGNSNNVGHQGVIWL 

QQHQGHPPKLLSYRNNNRPSGISERLSAYKSGNA 

ASLTIYGLQTEHEAD**CRPRRKLIPKTARLFFFFL 

EDNEEYLLRVY 


3412 


A 


164 


83 


RRGIPGSASLSLTMCVRSCFQSPRLQWVWRTAFL 

KHTQRRHQGSHRWTHLGGSTYRAVIFDMGGVLI 

PSPGRVAAEWEVQNRIPSGTILKALMEGGENGP 

WMRFMRAEITAEGFLREFGRLCSEMLKTSVPVD 

SFFSLLTSERVAKQFPVMTEAITQIRAKGLQTAVL 

SNNFYLPNQKSFLPLDRKQFDVIVESCMEGICKP 

DPRIYKLCLEQLGLQPSESIFLDDLGTNLKEAARL 
GIHTIKVNDPETAVKELEALLGFTLRVGVPNTRP 
VKKTMEIPKDSLQKYLKDLLGIQTTGPLELLQFD 
HGQSNPTYYIRLANRDLVLRKKPPGTLLPSAHAI 
EREFRIMKALANAGVPVPNVLDLCEDSSVIGTPF 
YVMEYCPGLIYKDPSLPGLEPSHRRAIYTAMNTV 
LCKMSVDLQAVGLEDYGKQGSTTWV/YSSRRA 
RGALLFLDWELSYPWGDPFADVGYSCLAHYLPS 
SFPVLRGINDCDLTQLGIPAAEEYFRMYCLQMGL 
PPTENWNFYMAFSFFRVAAILQGVYKRSLTGQA 
SSTYAEQTGKLTEFVSNLAWDFAVKEGFRVFKE
MPFTNPLTRSYHTWARPQSQWCPTGSRSYSSVPE 
ASPAHTSRGGLVISPESLSPPVRELYHRLKHFME 
QRVYPAEPELQSHQASAARWSPSPLEEDLKVKQP 
W*GGRSGRTSWRLLALGCHT


3413 


A 


105 


1573 


PESRHQCFSDRSSHFLTMEMEQEKMTMNKELSP 

DAAAYCCSACHGDETWSYNHPIRGRAKSRSLSA 

SPALGSTKEFRRTRSLHGPCPVTTFGPKACVLQN 

PQTMHIQDPASQRLTWNKSPKSVLVIKKMRDAS 

LLQPFKELCTHLMEENMIVYVEKKVLEDPAIASD 

ESFGAVKKKFCTFREDYDDISNQIDFnCLGGDGT 

LLYASSLFQGSVPPVMAFHLGSLGFLTPFSFENFQ 

SQVTQVIEGNAAVVL/RGSRLKVRWKELRGKK 

TAVHNGLGEKGSQAAGLDMDVGKQAMQYQVL 

NEVVIDRGPSSYl^NVDVYLDGHLnTVQGD/G* 

GPQHLSWGP*AFLGKE*RLRLSLSGVIVSTPTGST 

AYAAAAGASMIHPNVPAIMITPICPHSLSFRPIVV 

PAGVELKIMLSPEARNTAWVSFDGRKRQEIRHG 

DSISITTSCYPLPSICVRDPVSDWFESLAQCLHWN 

VRKKQAHFEEEEEEEEEG 


3414 


A 


20


2602 


VIVNKNVNWINYIYYNQQQRAFHELKEKLMSAL 

ALGLPDLTKPFTFYESEREKMAVGVLTQTVGPW 

PRPVAYLSKQLDGVSKGWPPCLRALAATALLAQ 

EADKLTLGQNLNIKAPHAVVTLMNTKGHHWLT

NARLTKYQSLPCENPHITIEVCNTLNPTTLLPVSE 

SPGEHNCVEVLDSVYSSRPDLRDQPWASSVDWE 

LYMDGSSFINSQGERCAGYAVVTLDAVIKAKLW 

LQGTSAQKAELIALTRAVELSEGQESLEELLGRY 

FYVSHLPAFAKAVAQLCITCRQHNARQSPTVSPH 

IQAYGAAPFEDLQVDFTEMPKCGGNKYLLVLTC 

TYSGWVEAYPTRTEKAYEVTRVLLRDLEPRFGLP 

LRIGSHNGPVFVADLDCVEINVDTGVIWATWIKN 

EKDPVQLQKGKSGPSCTKGQCNPLELVITOTLDP 

RWKKGERVTLGINGAGLNPRVNILVRGEVYKCS 

LEPVFQTFYDELNVPITEFPGKTRNLFLQLAEHV 

AQSLTVTSCYVCGGTVIADQWPWEARELVPTDP 

VPDEFPAQKNHPDNFWVLKASIIRQYYIARVEKD

FTLPVGRLHGG/RSNHTEKNPFSKFPKLQTV*AHP 

ESHRDWTAPTGLYWICGHRAYTKLPVASSCVIGTI 

KPSFFLLSIKTGELLGFPVYASR\KSIAIRN*NNDK 

WPPERDQYYGPAT+AQDGSWGYRIPIYMINRIIRL 

QAVLKUTATGRALTILAQQETQMRNAIYQNRLA 

LDYLLAAEGEVCRKFNLTNCCLHIDNQGQVVED 

IVRDMTKVAHVPVQVWHGFDPGAMFRKWFPAL 

GGFKTLIIRVIIVIGTYLLLPRLLPVLLQMIKSFIAT 

LVYQNASAQVYYINHY 


3415 


A 


455 


108 


NMSWRGRSTYRPRPRRSLQPPELIGAMLEPTDEE 
PKEEKPPTKSRNPTPDQKREDDSG/SAA*DFKWP 
EPGKPIFQGAMVRPKTGG/CGCEGGY*CQGEDS\P 
KAEHFKMPEAGEGKSQV 


3416 


A 


1 


874 


FFFFQRINFIEHSGSVSLLALACDLGWCEDWSCC 

LVQGGGDLVDWQT^GEDEAGGDTiDSVDEAR 

CKESQQEAQENLREDLCLESFAKDKILQIIEGSER 

EHEETRTKQAALDGEPLGGGQLTAVHLHPSKEQ 

QGQEGGERQRGARTHHWRGWEKGRRVRLRPPS 

GKLRADQPVRKLGGPTPS/TELPGLQPHAPTPHT 

A/PATPTYSPAPDTPNPPVRWKCPLPVEPRTRQLC 

RERTRJCACPPKPRPPLGLPGDPTGPVTHHAPPVS 

PTGASGQERRAEPGAVSYAHASATK 


3417 


A 


243 


847 


CLKYMYTYIFCPNCVSYKMKTDHFSLRYLHSSC 

AEDNKSSVDSSGQAAHPSKGKFFPHGTHWGTQC 

RGHISVLGWQCSCPSTGCRVGLGLAMCQTHAYI 

HTHTHTHTHTPTDYGAHHTDPLQRWGLGPR\KS 

EAGPLPQLSRDQSHPGPLSPGASPRSAGLPGWHP 

AHQEPRARGRCARDGLSLQTRLTNKYDIQCCQE 

MRK 


3418 


A


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 

AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA

AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 

FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTIWYRGVKITNFTTSWRNGLSFCAI 

LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNVVQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKXDMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEDDTNEEIPEGFVV 

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPWFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS 

EDEVLNKGFKDS\SQYVVGELAALENEQKQEDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

WKKNALIRRMNQLSLLEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 


3419 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 
AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA 
AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 

FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKM 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS

LLVWCKEVTKNYRGVKITNFTTSWRNGLSFCAI

LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ

ELNVVQIEENSSKSTYKVGNYETDTNSSVDQEKF

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTTDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV 

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS 

EDEVLNKGFKDS\SQYVVGELAALENEQKQIDTR

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 


3420 


A 


612 


1058 


ENLGPNYSHRLLHHPTFYKKIHKKHHEWTAPIG 

VISLYAHPffiHAVSNMLPVIVGPLVMGSHLSSITM 

WFSIALIITTISHCGYHLPFLPSPEFHDYHHLKFN 

QCYGVLGVLDHLHGTDTMFKQTKAYERHVLLL 

GFTPLSES1PDSPK 


3421 


A 


23 


2005 


LLTPCDGRIPGRPSVGAESGSDFQQRRRRRRDPE 

EPEKTELSERELAVAVAVSQENDEENEERWVGP 

LPVEATLAKKRKVLEFERVYLDNLPSASMYERS 

YMHRDVITHWCTKTDFHTASHDGHVKFWKKIE 

EGIEFVKHFRSHLGVIESIAVSSEGALFCSVGDDK 

AMKWDVWFDMINMLKLGYFPGQCEWIYCPG 

DAISSVAASEKSTGKIFIYDGRGDNQPLHIFDKLH 

TSPLTQIRLNPVYKAVVSSDKSGMffiYWTGPPHE 

YXFPKNVNWEYKTDTDLYEFAKCKAYPTSVCFS

PDGKXIATIGSDRXVRIFRFVTGKLMRVFDESLS

MFTELQQMRQQLPDMEFGRRMAVERELEKVDA 

VRLINIVFDETGHFVLYGTMLGIKVINVETNRCV 

RILGKQENIRVMQLALFQGIAKKHRAATTffiMKA 

SEOTVLQNIQADPTIVCTSFKKNRFYMFTKREPE 

DTKSADSDRDVFNEKPSK^EVMAATQAEGPKRV 

SDSAIIHTSMGD1HTKLFPVECPKTVENFCVHSRN 

GYYNGHTFHRIDCGFMIQTGDPTGTGMGGESIWG 

GEFEDEFHSTLRHDRPYTLSMANAGSNTNGSQFF 

ITVVPITWLDNKHTVFGRVTKGMEVVQRISN\VK 

VNPKTDKPYEDVSIIN1TVK 


3422 


A 


2486 


433 


FVLVCAPLTWAGARHRRMAASKKPPRVRVNHQ
DFQLRNLRIIEPNEVTHSGDTGVETDGRMPPKVT 

SELLRQLRQAMRNSEYVTEPIQAYDPSGDAHQSE 

YIAPCDCRRAFVSGFDGSAGTAIITEEHAAMWTD 

GRYFLQAAKQMDSNWTLMKMGLKDTPTQEDW 

LVSVLPEGSRVGVDPLIIPTDYWKKMAKVLRSA 

GHHLIPVKENLVDKIWTDRPERPCKPLLTLGLDY 

TGISWKDKVADLRLBGVLAERNVMWFVVTALDEI 

AWLFNLRGSDVEHNPVFFSYAnGLETIMLFIDGD 

RIDAPSVKEHLLLDLGLEAEYRIQVHPYKSILSEL 

KALCADLSPREKVWVSDKASYAVSETIPKDHRC 

CMPYTPIClAKA\VKNSA\ESEGMRRAHIKDAVAL 

CELFNWLEKEVPKGGVTEISAADKAEEFRRQQA 

DFVDLSFPTISSTGPNGAIim'APVPETNRTLSLDE 

VYLIDSGAQYKDGTTDVTRTMHFGTPTAYEKEC 

FTYVLKGHIAVSAAVFPTGTKGHLLDSFARSAL 

WDSGLDYLHGTGHGVGSFLNVHEGPCGISYKTF 

SDEPLEAGMIVTDEPGYYEDGAFGIRIENWLVV 

PVKTKYNFNNRGSLTFEPLTLVPIQTKMIDVDSL 

TDKECDWLNNYHLTCRDVIGKELQKQGRQEAL

EWLIRETQPISKQH 


3423 


A 


5515 


934 


FKMPENPATDKLQVLQVLDRLKMKLQEKGDTS 

QNEKLSMFYETLKSPLFNQELTLQQSIKQLKGQL

NHIPSDCSANFDFSRjtGLLVFTDGSITNGNVHRPS 

KNfSTVSGLFPWTPKLGNEDFNSVIQQMAQGRQIE 

YIDIERPSTGGLGFSVVALRSQNLGKVDIFVKDV 

QPGSVADRDQRLKENDQ1LAINHTPLDQNISHQQ 

AIALLQQTTGSLRLIVAREPVHTKSSTSSSLNDTT 

LPETVCWGHVEEVELINDGSGLGFGIVGGKTSGV 

WRTIVPGGLADRDGRLQTGDHTLKJGGTNVQG 

MTSEQVAQVLRNCGNSVRMLVARDPAGDISVTP 

PAPAALPVALPTVASKGPGSDSSLFETYNVELVR 

KDGQSLGIRIVGYVGTSHTGEASGIYVKSIIPGSA 

AYHNGHIQVNDKIVAVDGVNIQGFANHDVVEVL 

RNAGQVVHLTLVRRKTSSSTSPLEPPSDRGTVVE

PLKPPALFLTGAVETETNVDGEDEEIKERIDTLKN 

DMQALEKLEKVPDSPENELKSRWENLLGPDYEV 

MVATLDTQIADDAELQKYSKLLPIHTLRLGVEV

DSFDGHHYISSIVSGGPVDTLGLLQPEDELLEVN 

GMQLYGKSRREAVSFLKEVPPPFTLVCCRRLFDD 

EASVDEPRRTETSLPETEVDHNMDVNTEEDDDG 

ELALWSPEVKJVELVKDCKGLGFSILDYQDPLDP 

TRSVIVIRSLVADGVAERSGGLLPGDRLVSVNEY 

CLDNTSLAEAVEILKAVPPGLVHLGICKPLVEDN 

EEESCYILHSSSNEDKTEFSGTIHDINSSLELEAPK 

GFRDEPYFKEELVDEPFLDLGKSFHSQQKEIEQS 

KEAWEMHEFLTPRLQEMDEEREMLVDEEYELY 

QDPSPSMELYPLSHIQEATPVPSVNELHFGTQWL 

HDNEPSESQEARTGRTVYSQEAQPYGYCPENVM

KENFVMESLPSVPSTEGNSQQGRFDDLENLNSLA 

KTSLDLGMEPNDVQGPSLLIDLPVVAQRREQEDL 

PLYQHQATRVISKASAYTGMLSSRYATDTCELPE

REEGEGEETPNFSHWGPPRIVEIFREPNVSLGISIV 

GGQTV1KRLKNGEELKGIFIKQVLEDSPAGKTNA 

LKTGDKBLEVSGVDLQNASHSEAVEAIKNAGNP 

WFWQSLSSTPRV1PNVHNKANKITGNQNQDTQ 

EKKEKRQGTAPPPMKLPPPYKALTDDSDENEEE 

DAFTDQKIRQRYADLPGELHIIELEKDKNGLGLS 

LAGNKDRSRMSIFVVGINPEGPAAADGRMHIGD 

ELLEINNQ]LYGRSHQN\ASAIIKTAPSKVKLVFIR 

NEDAVNQMAVTPFPVPSSSPSSIEDQSGTEPISSEE 

\DGSLEWGIKQLPESESFKLAVSQMKQQKYPTKV 

SFSSQEIPLAPASSYHSTDADFTGYGGFQAPLSVD 

PATCPIVPGQEMIIEISKRRSGLGLSIVGGKDTPLV 

NGVDLKNSSHEEAITALRQTTQKVRLVVYRDEA 

HYRDEENLEIFPVDLQKKAGRGLGLSIVGKR 


3424 


A 


2223 


1162 


HASERVVQLPDFVWDQYTHSLGRVEREFKNRKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG

NRTAKQVASRVQKYFIKLTKAGIPVPGRTPNLYI 

YSKKSSTSRRQHPLNKHLFKPVGTFMTSHEPPVY 

MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN 

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW\HCR\DCPP\EMSLVDFC\DS 

C\SDCLHET\DIHKGDHQLEPIYRS\ETFLDRDYCV

SQGTSYNYLDPNYFPANR 


3425 


A 


2223 


1162 


HASERVVQLPDFVWDQYTHSLGRVEREFKNRKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAKQVASRVQKYFIECLTKAGIPVPGRTPNLYI 

YSKKSSTSRRQHPLNKHLFKP\GTFMTSHEPPVY 

MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW\HCR\DCPP\EMSL\DFC\DS 

C\SDCLHET\DIHKGDHQLEPIYRS\ETFLDRDYCV 

SQGTSYNYLDPNYFPANR 


3426 


A 


2 


1553 


LFVWHDDPRWGTPRYWLGALYRNQQSSPTAPP 

GLLPLEYFPAAPHCSHSRQWRCSQTHRIHHHPQ 

MLGPCRQEICGITMAAGTLYTYPENWRAFRALI

AAQYSGAQVRVLSAPPHFHFGQTNRTPEFLRKFP 

AGKVPAFEGDDGFCVFESNAIAYYVSNEELRGST

PEAAAQWQWVSFADSDIYPPASTWVFPTLGIM 

HHNKQATENAKEEVRRILGLLDAYLKTRTFLVG 

ERVTLADITWCTLLWLYKQVLEPSFRQAFPNTN 

RWFLTCINQPQFRA\VFGEVKLCEKMAQF\DAKK 

FAETQPKKDTPRKEKGSREEKQKPQAERKEEKK 

AAAPAPEEEMDECEQALAAEPKAKDPFAHLPKS 

TFVLDEFKRICYSNEDTLSVALPYFWEHFDKDGW

SLWYSEYRFPEELTQTFMSCNLITGMFQRLDKLR 

KNAFASVILFGTNNSSSISGVWVFRGQELAFPLSP

DWQVDYESYTWRKLDPGSEETQTLVREYFSWE 

GAFQHVGKAFNQGKIFK 


3427 


A 


755 


52 


TAARRRQKGTAARRRQKGTAARRRQKGTAARR 

RQKGTAARRRQKGTAARRRQKGTAARRRQKGT 

AARRRQKGTAARRRQKGTAARRRQKGTAARRR 

QKGLSNLDAAEWLPPKKGVGEKKKGPFLAINEV 

VT\REYPINILKRIHGVGFKKRAPRALK[EIRKFAM 

KEMGTPDVRIDTRLNKAVWAKGIRNVPYRIRVR 

LSRKJRNEDEDSPNKLYTLVTYVPVTTFKNLQTV 

NVDEN 


3428 


A 


4 


1939 


LPLSLSFSEMPLPLLPMDLKGEPGPPGKPGPWGP 

PGPPGFPGKPGHGKPGLHGQPGPAGPPGFSRMG 

KAGPPGLPGNVGPPGQPGLRGEPGIRGDQGLRGP 

PGPPGLPGPSGITIPGKPGAQGVPGPPGFQGEPGP 

QGEPGPPGDRGLKGDNGVGQPGLPGAPGQGGAP 

GPPGLPGPAGLGKPGLDGLPGAPGDKGESGPPG 

VPGPRGEPGAVGPKGPPGVDGVGVPGAAGLPGP 

QGPSGAICGEPGTRGPPGLIGPTGYGMPGLPGPKG 

DRGPAGVPGLLGDRGEPGEDGEPGEQGPQGLGG 

PPGLPGSAGLPGRRGPPGPKGEAGPGGPPGVPGI 

RGDQGPSGLAGKPGVPGERGLPGAHGPPGPTGP 

KGEPGFTGRPGGPGVAGALGQKGDLGLPGQPGL 

RGPSGIPGLQGPAGPIGPQGLPGLKGEPGLPGPPG 

EGRAGEPGTAGP\RGPPGVPGSPGITGPPG\LPGPP 

GAPGAFDETGIAGLHLPNGGVEGAVLGKGGKPQ 

FGLGELSAHATPAFTAVLTSPLPASGMPVKFDRT 

LYNGHSGYNPATGIFTCPVGGVYYFAYHVHVKG 

TNVWVALYKNNVPATYTYDEYKKGYLDQASG

GAVLQLRPNDQVWVQMPSDQANGLYSTEYIHSS 

FSGFLLCPT 


3429 


A


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRW 

AGPESLPPLPRSLIMDSPRAGTHQGPLDAETEVG 

ADRCTSTAYQEQRPQVEQVGKQAPLSPGLPAMG

GPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCAF 

TVALRARRGADLSSLRALLGQALPHQ\AQLGQLS 

YLAPGEDGHWVPIPEEESLQRAWQDAAACPRGL 

QLQCRGAGGRPVLYQWAQHSYSAQGPEDLGF 

RQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCFV 

VPAGPRMSGAPGRLPRSQQGDQP 


3430 


A 


799


1989 


mKYINIRKKIKLLSPLPPLWSHLALLQASATKWV 

LTPAAFAGKLLSVFRQPLSSLWRSLVPLFCWLRA 

TFV^LATKRRKQQLVLRGPDETKEEEEDPPLPTT 

PTSVNYHFTRQCNYKCGFCFHTAKTSFVLPLEEA

KRGLLLLK\EAG\LEKINFSGG\EPFLQDRGEYLGK 

LVRFCKVELRLPSVSI\VSNGSLIRERWFQNYG\E 

YLDILAISCDSFDEEVNCPMGRGN\GKKNHVENL 

QKL\RRWCRDYRVPFKINSVINPF\NVEEDMTEQI 

KALNPVRWKVFQCLLIEGENCGEDA\LREAERFV 

IGDEEFERFLERHKEVSCLWESNQKMKDSYLIL 

DEYMRFLNCRKGRKDPSKSILDVGVEEAIKFSGF 

DEKMFLKRGGKYIWSKADLKLDW 


3431 


A 


5468 


2146


ACGFLPGRCHFSTFKQCQEWLSRLSRATARPAKP 

EDLFAFAYHAWCLGLTEEDQHTHLCQPGEHIRC 

RQEAELARMGFDLQNVWRVSHINSNYKLCPSYP 

QKLLVPVWITDKELENVASFRSWKRIPVVVYRH 

LRNGAAIARCSQPEISWWGWRNADDEYLVTSIA 

KACALDPGTRATGGSLSTGNNDTSEACDADFDS 

SLTACSGVESTAAPQKLLILDARSYTAAVANRAK 

GGGCECEEYYPNCEVVFMGMANIHAIRNSFQYL 

RAVCSQMPDPSNWLSALESTKWLQrILSVMLKA 

AVLVANTVDREGRPVLVHCSDGWDRTPQIVALA 

KILLDPYYRTLEGFQVLVESDWLDFGHKFGDRC 

GHQENVEDQNEQCPVFLQWLDSVHQLLKQFPCL 

FEFNEAFLVKLVQHTYSCLYGTFLANNPCVEREK 

RNIYK/RGTCSVWALLRAGNKNFHNFLYTPSSD 

MVLHPVCHVRALHLWTAVYLPASSPCTLGEEN 

MDLYLSPVAQSQEFSGRSLDRLPKTRSMDDLLS 

ACDTSSPLTRTSSDPNLNNHCQEVRVGLEPWHS 

NPEGSETSFVDSGVGGPQQTVGEVGLPPPLPSSQ 

KDYLSNKPFKSHKSCSPSYKLLNTAVPREMKSNT 

SDPEIKVLEETKGPAPDPSAQDELGRTLDGIGEPP 

EHCPETEAVSALSKVISNKCDGVCNFPESSQNSPT 

GTPQQAQPDSMLGVPSKCVLDHSLSTVCNPPSA 

ACQTPLDPSTDFVLNQDPSGSVASISHQEQLSSVP 

DLTHGEEDIGKRGNNRNGQLLENPRFGKMPLEL 

VRKPISQSQISEFSFLGSNWDSFQGMVTSFPSGEA 

TPRRLLSYGCCSKRPNSKQMRATGPCFGGQWAQ 

REGVKSPVCSSHSNGHCTGPGGKNQMWLSSHPK 

QVSSTKPVPLNCPSPVPPLYLDDDGLPFPTDVIQH 

RLRQffiAGYKQEVEQLRRQVRELQMRLDIRHCC 

APPAEPPMDYEDDFTCLKESDGSDTEDFGSDHSE 

DCLSEASWEPVDKKETEVTRWVPDHMASHCYN 

CDCEFWLAKRRHHCRNCGNVFCAGCCHLKLPIP 

DQQLYDPVLVCNSCYEHIQVSRARELMSQQLKK 

PIATASS 


3432 


A 


36 


1873 


MTFFSSVADFIGLDPRIAAWLIDPSDATPSFEDLV 

EKYCEKS'ITVKVNSTYGNSSRMVNQNVRENLKT

LYRLTMDLCSKLKDYGLWQLFRTLELPLIPILAV 

MESHAIQVNKEEMEKTSALLGARLKELEQEAHF 

VAGERFLITSNNQLREILFGKLKLHLLSQRNSLPR 

TGLQKYPSTVSEALNALRDLHPLPKIILEYRQVH 

K1KSTFVDGLLACMKKGSISSTWNQTGTVTGRLS 

AKHPNIQGISKHPIQITTPKNFKGKEDKILTISPRA 

MFVSSKGHTFLAADFSQIELRILTHLSGDPELLKL 

FQESERDDVFSTLTSQWKDVPVEQVTHADREQT 

KKWYAWYGAGKERLAACLGVPIQEAAQFLES 

FLQKYKKIKDFARAAIAQCHQTGCVVSIMGRRR 

PLPRIHAHDQQLRAQAERQAVNFWQGSAADLC 

KLAMIHVFTAVAASHTLTARLVAQIHDELLFEVE 

DPQEPECAALVRRTMESLEQVPLKVSLSAGRSWG 

HLVPLQEAW\ALRQAHVALSLPATAWLPLGPLP 

APSPHPCIFRLHFVCSPRQQWEERTGFQQSIVWPS 

PRSPALYAPGRINPLGLGWPAIPWSKCLCKALKK 

K 


3433 


A 


1481 


476


IPPKERAPGIRASCLA1TAGARPTSYGRVGCEGDV 

RLSPVSPLLAPPDPRLASRWEGRSRMKGKKGIVA

ASGSETEDEDSMDIPLDLSSSAGSGKRRRRGNLP 

KESVQILRDWLYEHRYNAYPSEQEKALLSQQTH 

LSTLQVCNWFENAJUIRLLPDMLRKDGKDPNQFTI 

SRRGAKISETSSVESVMGDCNFMPALEETPFHSF'H 

AGPNPTLG\RPLSAKP/SQSPGSVLARPSVICHTTV 

TAffiRLSLSLSCQSVGCGQN1\DIQQIAT\RNLKDS 

SLMYPEDTCKSGPSTNTQSGLFNTPPPTPPDLNQ 

DFSGFQLLVDVALKRAAEMELQAKLTA 


3434 


A


1720 


1243 


NGPVPPGGSKTKWAGGSAAEGSPRLSPSPGAAQ 

VPALLRGEPRGGAAAGSFWKPLHQHSCGLRPPP/ 

PPD/RLSRLPGKTLSACDRENGARRPLLLGSTSFIP 

JGRRTYASAAEPVGSKAVLVTGCDSGFGFSLAKH 

LHSKGFLVFAGCLMKDKGHDGVKELDSLNSDRL 

RTVQLNVCSSEEVEKV/VGDCPLEPEGP\EKGMW 

GLVNNAGISTFGEVEFTSLETYKQVAEVNLWGT 

VRMTKSFLPLIRRAKGRVVNISSMLGRMANPAR

SPYCITKFGVEAFSDCLRYEMYPLGVKVSVVEPG 

NFIAATSLYSPESIQAIAKKMWEELPEVVRKDYG 

KKYFDEKIAKMETYCSSGSTDTSPVIDAVTHALT

ATTPYTRYHPMDYYWWLRMQIMTHLPGAISDM 

IYIR 


3435 


A 


842 


3595 


ENQQQMLVAKEQRLHFLKQQERRQQQSISENEK 
LQKLKERVEAQENKLKKIRAMRGQVDYSKIMN 
GNLSAEffiRFSAMFQEKKQEVQTAILRVDQLSQQ 
LEDLKKGKLNGFQSYNGKLTGPAAVELKRLYQE 
LQIRNQLNQEQNSKLQQQKELLNICRNMEVAMM 
DKRISELRERLYGKKIQACEKVFLNRVNGTSSPQ 
SPLSTSGRVAAVGPYIQVPSAGSFPVLGDPIKPQS 
LSIASNAAHGRSKSANDGNWPTLKQNSSSSVKP 
VQVAGADWKDPSVEGSVKQGTVSSQPVPFSALG 
PTEKPGIEIGKVPPPIPGVGKQLPPSYGTYPSPTPL 
GPGSTSSLERRKEGSLPRPSAGLPSRQRPTLLPAT 
GSTPQPGSSQQIQQRISVPPSPTYPPAGPPAFPAGD 
SKPELPLTVAIRPFLADKGSRPQSPRKGPQTVNSS 
SIYSMYLQQATPPKNYQPAAHSALNKSVKAVYG 
KPVLPSGSTSPSPLPFLHGSLSTGTPQPQPPSESTE 
KEPEQDGPAAPADGSTVESLPRPLSPTKLTPIVHS 
PLRYQSDADLEALRRKLANAPRPLKKRSSITEPE 
GPGGPN1QKLLYQRFNTLAGGMEGTPFYQPSPSQ 
DFMVTLADVDNGNTNFANGNLEELPPAQPTAPLP 
AEPAPSSDANDNELPSPEPEELICPQTTHQTAEPA 
EDNNNNVATVPTTEQIPSPVAEAPSPGEEQVPPA
PLPPASHPPATSTNBCRTNLKKPNSERTGHGLRVR 
FNPLALLLDASLEGEFDLVQRJIYEVEDPSKPNDE 
GITPLHNAVCAGHHHIVKFLLDFGVNVNAADSD 
GWTPLHCAASCNSVHLCKQLVESGAAIFASTISD 
IETAADKCEEMEEGYIQCSQFLYGVQEKLGVMN 
KGVAYALWDYEAQNSDELSFHEGDALULRRKD
E 


3436


A 


3 


2604 


GSTrL^SEKMKTGRSALVVTDTGDMSVLNSPRHQ 

SCIMHVDMDCFFVSVGIRNRPDLKGKPVAVTSN 

RGTGRAPLRPGANPQLEWQYYQNKILKGKADIP 

DSSLWENPDSAQANGIDSVLSRAEIASCSYEARQ 

LGIKNGMFFGHAKQLCPNLQAVPYBFHAYKEVA 

QTLYETLAS\YTHNIEAVSCDEALVDITEDLAETK 

LTPDEFANAVRMEIKDQTICCAASVGIGSNILLAR 

MATRKAKPDGQYHLKPEEVDDFIRGQLVTNLPG 

VGHSMESKl^SLGIKTCGDLQYMTMAKLQKEF 

GPKTGQMLYRFCRGLDDRPVRTEKERKSVSAEI 

NYGIRFTQPKEAEAFLLSLSEEIQRRLEATGMKG 

KRLTLKIMVRKPGAPVETAKFGGHGICDNIARTV 

TLDQATDNAKIIGKAMLNMFHTMKLNISDMRGV 

GIHVNQLVPTNLNPSTCPSRPSVQSSHFPSGSYSV 

RDVFQVQKAKKSTEEEHKEVFRAAVDLEISSASR 

TCTFLPPFPAHLPTSPDTNKAESSGKWNGLHTPV 

SVQSRLNLSffiVPSPSQLDQSVLEALPPDLREQVE 

QVCAVQQAESHGDKKKEPVNGCNTGILPQPVGT 

VLLQIPEPQESNSDAGINLIALPAFSQVDPEVFAA 

LPAELQRELKAAYDQRQRQGENSTHQQSASASV 

PKNPLLHLKAAVKEKKRNKKKKTIGSPKRIQSPL 
N>naLNSPAKTLPGACGSPQKLIDGFLKHEGPPA 
EKPLEELSASTSGVPGLSSLQSDPAGCVRPPAPNL 
AGAVEFNDVKTLLREWITTISDPMEEDILQVVKY 
CTDLIEEKDLEKLDLVIKYMKRLMQQSVESVWN 
MAFDFELDNVQWLQQTYGSTLKVT 


3437 


A 


32 


4038 


SLLRLLKAQWGSSGAASEPVVLGEEGCGFPSTNE 

YPDLEEERATYPQEEDRFLTPGRAQLLWSPWSPL 

DQEEACASRQLHSLASFSTVTARRNPLHNPWGM 

ELAASENTDSPSPRPLRPGVTLPPGALTMNTKDT

TEVAENSHHLKIFLPKKLLECLPRCPLLPPERLRW 

NTNEEIASYLITFEKHDEWLSCAPKTRPQNGSIIL 

YNRKKVKYRKDGYLWKKRKDGKTTREDHMKL 

KVQGMECLYGCWHSSIVPTFHRRCYWLLQNPD 

IVLVHYLNVPALEDCGKGCSPIFCSISSDRREWLK 

WSREELLGQLBCPMFHGIKWSCGNGTEEFSVEHL 

VQQILDTHPTKPAPRTHACLCSGGLGSGSLTHKC 

SSTKHRnSPKVEPRALTLTSIPHPHPPEPPPLIAPLP 

PELPKAHTSPSSSSSSSSSGFAEPLEIRPSPPTSRGG 

SSRGGTAILLLTGLEQRAGGLTPTRHLAPQADPR

PSMSLAVWGTEPSAPPAPPSPAFDPDRFLNSPQR 

GQTYGGGQGVSPDFPEAEAAHTPCSALEPAAAL 

EPQAAARGPPPQSVAGGRRGNCFFIQDDDSGEEL 

KGHGAAPPIPSPPPSPPPSPAPLEPSSRVGRGEALF 

GGPVGASELEPFSLSSFPDLMGELISDEAPSIPAPT

PQLSPALSTITDFSPEWSYPEGGVKVLITGPWTEA 

AEHYSCVFDHIAVPASLVQPGVLRCYCPAHEVG 

LVSLQVAGREGPLSASVLFEYRARRFLSLPSTQL 

DWLSLDDNQFRMSE.ERLEQMEKRMAEIAAAGQ 

VPCQGPDAPPVQDEGQGPGFEARVWLVESMEP 

RSTWKGPERLAHGSPFRGMSLLHLAAAQGYARL 

mTLSQWRSVETGSLDLEQEVDPLNVDHFSCTPL 

3V1WACALGHLEAAVLLFRWNRQALSIPDSLGRLP 

LSVAHSRGHVRLARCLEELQRQEPSVEPPFALSP 

PSSSPDTGLSSVSSPSELSDGTFSVTSAYSSAPDGS 

PPPAPLPASEMTMEDMAPGQLSSGVPEAPLLLM 

DYEATNSKGPLSSLPALPPASDDGAAPEDADSPQ 

AVDVPVDMTSLAKQIIEATPERIKREDFVGLPEA 

GASMRERTGAVGLSETMSWLASYL\ENVDHFPS 

STPPSEL\PFER\GRLGLSLTAPSWAEFLSCIPPVGK 

IGKLIFALLTL\SD\QEQRELYEAARVIQTAFRKYIC 

GRRLKEQQEVAAAVIQRCYRKY1CQLTWIALKFA 

LYKKMTQAAILIQSKFRSYYEQKRFQQSRRAAV 

LIQQHYRSYRRRPGPPHRTSATLPARNKGSFLTK 

KQDQAARKIMRFLRRCRHRMRELKQNQELEGLP 

QPGLAT 


3438 


A 


469 


2602


FGRLLWGTAFKSWKMKAPPHLILLYATFTQSLK 

WTKRGSADGCTDWSmiKKYQVLVGEPVRIKC 

ALFYGY1RTNYSLAQSAGLSLMWYKSSGPGDFE 

EPIAFDGSRMSKEEDSIWFRPTLLQDSGLYACVIR 

NSTYCMKVSISLTVGENDTGLCYNSKMKYFEKA 

ELSKSKEISCRDffiDFLLPTREPEILWYKECRTKT 

\\OU>SIVFKRDTLLIREVREDDIGNYTCELKYGGF 

VVRRTTELTVTAPLTDKPPKLLYPMESKLTIQET 

QLGDSANLTCRAFFGYSGDVSPLIYWMKGEKF1E 

DLDENRVWESDIVKILKEHLGEQEVSISLIVDSVEE 

GDLGNYSCYVENGNGRRHASVLLHKRELMYW 

ELAGGLGAILLLLVCLVTIYKGYKIEIMLFYRNHF 

GAEELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFDPDRDLIPTGTYI 

EDVARCVDQSKJU^nVMTPNYVVRRGWSIFELET 

RLRNMLVTGEIKVILIECSELRGIMNYQEVEALK 

HTIKLLTVIKWHGPKCNKLNSKFWKRLQYEMPF 

KRIEPITHEQALDVSEQGPFGELQTVSAISMAAAT 

STALATAHPDLRSTTHNTYHSQMRQKHYYRSYE 

YDVPPTGTLPLTSIGNQHTYCNIPMTLDSfGQRPQT 

KSSREQNPDEAHTNSAILPLLPRETSISSVIW 


3439 


A 


251 


2037 


GPGNSSlLIGGGHLFLIRSCLlNfLLLLNSKENTEHT 
MAKKVAVIGAGVSGLSSIKCCVDEDLEPTCFERS 
DDIGGLWKFTERGSSLSVMIWPLALSLLRHGGFC 
YSDFPFHEDYPNFMNHEKFWDYLQEFAEHFDLL 
KYIQFKTTVCGITKJRPDFSETGQWDWTETEGKQ 
NRAVFDAVMVCTGHFLNPHLPLEAFPGIHKFKG
QILHSQEYKIPEGFQGKRVLVIGLGNTGGDIAVEL 
SRTAAQVLLSTRTGTWVLGRSSDWGYPYNMMV 
TRRCCSFIAQVLPSRFLNWIQERKLNKRFNHEDY 
GLSITKGKIOVKFIVNDELPNCILCGAITMKTSVIE 
FTETSAVFEDGTVEENIDVVIFTTGYTFSFPFFEEP 
LKSLCTKKIFLYKQVFPLNLERATLAIIGLIGLKGS 
ELSGTELQARWVTRWKGLCKRPASQKLMMEAT
EKEQLIKRGVFKDTSKDKFDYIAYMDDIAACIGT
KPSIPLLFLKDPRLAWEVFFGPCTPYQYRVLMGPG
KWDGARNAILTQWDRTLKPLKTR1VPDSSKAWP 
SM\SHYLKAWGAPVLLASLLLICK\SSLFLKLVRD 
KLQDRMSPYLVSLWRG 


3440 


A 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVM 

ENSKVLGESMAGISQNAKTGDLPAFGECVGIASK 

ALCGLTEAAAQAAYLVGIFDPNSQAGHQGLVDP 

IQFARANQAIQMACQNLVDPGSSPSQVLSAATIV 

AKHTSALCNACRIASSKTANPVAKRPIFVQSAKE 

VANSTANLVKTIKALDGDFSEDNRNKCRIATAPL 

IEAVENLTAFASNPEFVSIPAQISSEGSQAQEPILV 

SAKPMLESSSYLIRTARSLAJNPKDPPTWSVLAG

HSHTVSDSIKSLITSERDKAPGQRECDYSIDGINRC 

IRDIEQASLAAVSQSLATRDDISVEALQEQLTSW 

QEIGHLIDPIATAARGEAAQLGHKGTQLASYFEP 

LILAAVGVASKILDHQQQMTVLDQTKTLAESAL 

QMLYAAKEGGGNPKAQHTHDAITEAAQLMKEA 

VDDIMVTLNEAASEVGLVGG3MVDAIAEAMSKL 

DEGTPPEPKGTFVDYQTTVVKYSKAIAVTAQEM 

MTKSVTNPEELGGLASQMTSDYGHIAFQGQMA 

AATAEPEEIGFQIRTRVQDLGHGCIFLVQKAGVAL 

QVCPTDSYTKRELIECARAVTEKVSLVLSALQAG 

NKGTQACITAATAVSGI1ADLDTTIMFATAGTLN 

AENSETFADHRENILKTAKALVEDTKLLVSGAAS 

TPDKLAQAAQSSAATITQLAEWKLGAASLGSD 

DPETQWLINAIKDVAKALSDLISATXGAASKPV 

DDPSMYQLKGAAKVMVTNVTSLLKTVKAVEDE 

ATRGTRALEATTECnCQELTVFQSKDVPEKTSSPE 

ESIRMTKGITMATAKAVAAGNSCRQEDVIATAN 

LSRKAVSDMLTACKQASFHPDVSDEVRTRALRF 

GTECTLGYLDLLEHVLVILQKPTPELKQQLAAFS 

KRVAGAVTELIQAAEAMKGTEWVDPEDPTVIAE

TELLGAAASEEAAAICKLEQLKPRAKPKQADETL 

DFEEQDLEAAKSJAAATSALVK5ASAAQRELVAQ 

GKVGSIPANAADDGQWSQGLISAARMVAAATSS 

LCEAANASVQGHASEEKLISSAKQVAASTAQLL

VACKVKADQDSEAMRRLQAAGNAVKRASDNL 

VTIAAQKAAFGKADDDDVVVKTKFVGGIAQIIAA 

QEEMLKKERELEEARKKLAQIRQQQYKE r LPTEL 

REDEG 


3441 


A 


3 


1584 


NSARGGVGVRGARAMATVQEKAAALNLSALHS 
PAHRPPGFSVAQKPFGATYVWSSIINTLQTQVEV 
KKRRHRLKRrlNDCFVGSEAVDVEFSHLIQNKYF 
GDVDIPRAKVVRVCQALMDYKVFEAVPTKVFG 
KDKKPTFEDSSCSLYRFTTIPNQDSQLGKENKLY 
SPARYADALFKSSDIRSASLEDLWENLSLKPANS
PHVNISTTLSPQVINEVWQEETIGRLLQLVDLPLL 
DSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGIL 
KAYSDSQEDEWLSAAIDCLEYLPDQMWEISRSF 
PEQPDRTDLVKELLFDAIGRYYSSREPLLNHLSD 
VHNGIAELLVNGKTEIALEATQLLLKLLDFQNRE 
EFRRLLYFMAVAANPSEFKLQKESDNRMVVKRI 
FSKAIVDNKNLSKGKTDLLVLFL\MDHQKDVFKI 
PGTL\HKIVS\VK\L^4AIQNGRDPNRDAGYIYCQRI 
DQRDYSNITEKTTIDELLYLLKTLDEDSKLSAKE 
KKK\LLGQFYKCHPD1FIEHFGD 


3442 


A 


160 


822 


SPASGHCRLNGAAVAMFGCLVAGRLVQTAAQQ

VAEDKFVFDLPDYESINHVVVF3V1LGTIPFPEGMG 

GSVYFSYPDSNGMPVWQLLGFVTNGKPSAIFKIS 

GLKSGEGSQHPFGAMNIVRTPSVAQIGISVELLDS 

MAQQTPVGNAAVSSVDSFTQFTQKMLDNFYNF 

ASSFAVSQAO>DDTQ/RPSEMFIPANVVLKWYENF 

QRRTSTEPSLLENIIWnONF 


3443 


A 


3


1373 


SWHVRRRWLEATMAGGMKVAVSPAVGPGPWG 

SGVGGGGTVRLLLILSGCLVYGTAETDVNWML 

QESQVCEKRASQQFCYTN\n.PQWHDIWTRIQIR 

VNSSRLVRVTQVENEEKLKELEQFSIWNFFSSFL 

KEKLNDTYVNVGLYSTKTCLKVEEEKDTKYSVI 

VIRRFDPKLFLVFLLGLMLFFCGDLLSRSQIFYYS 

TGMWGIVASL\LniFILSKFMPKKSPIYVILVGGW 

SFSLYLIQLVFKNLQEIWRCYWQYLLSYVLTVGF 

MSFAVCYKYGPLENERSINLLTWTLQLMGLCFM 

YSGIQIPHIALAKIIALCTKNLEHPIQWLYTTCRKV

CKGAEKPVPPRLLTEEEYRIQGEVETRKALEELR 

EFCNSPDCSAWKTVSRIQSPKRFADFVEGSSHLT 

PNEVSVHEQEYGLGSIIAQDEIYEEASSEEEDSYS 

RCPAITQNNFLT 


3444 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATTLDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNDFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV

DGPKQCLLMR 


3445 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPGNGSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATILDRIHSLQINSSLST

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV

DGPKQCLLMR 


3446 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATILDRIHSLQINSSLST

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV

DGPKQCLLMR 


3447 


A 


1 


2930 


VLLGPLWDKLSTADHPVIVTMASKRKSTTPCMIP 

VKTVVLQDASMEAQPAETLPEGPQQDLPPEASA 

ASSEAAQNPSSTDGSTLANGHRSTLDGYLYSCK 

YCDFRSHDMTQFVGHMNSEHTDFNKDPTFVCSG 

CSFLAKTPEGLSLHNATCHSGEASFVWNVAKPD 

NHVVVEQSIPESTSTPDLAGEPSAEGADGQAEinT 

KTPIMKIMKGKAEAKKIHTLKENVPSQPVGEALP 

KLSTGEMEVREGDHSFINGAVPVRQASASSAKN 

PHAANGPLIGTVPVLPAGIAQFLSLQQQPPVHAQ 

HHVHQPLPTAKALPKVMIPLSSIPTYSAAMDSNS 

FLKNSFmOTYPTKAELCYLTVVTKYPEEQLKIW 

FTAQRLKQGISWSPEEffiDARKXMFNTVIQSVPQ 

PTITVLNTPLVASAGNVQHLIQAALPGHVVGQPE 

GTGGGLLVTQPLMANGLQATSSPLPLTVTSVPK 

QPGVAPINTVCSNTTSAVKVVNAAQSLLTACPSI 

TSQAFLDASIYKNKKSHEQLSALKGSFCRNQFPG 

QSEVEHLTKVTGLSTREVRKWFSDRRYHCRNLK 

GSRAMIPGDHRSIIIDSVPEVSFSPSSKVPEVTCIPT 

TATLATHPSAKRQSWHQTPDFTPTKYKERAPEQ 

LRALESSFAQNPLPLDEELDRLRSETKMTRREIDS 

WFSERRKKVNAEETKKAEENASQEEEEAAEDEG 

GEEDLASELRVSGENGSLEMPSSHILAERKVSPIK 

INLKNLRVTEANGRNEPGLGACDPEDDESNKLA 

EQLPGKVSCKKTAQQRHLLRQLFVQTQWPSNQD 

YDSMAQTGLPRPEWRWFGDSRYALKNGQLK 

WYEDYKRGNFPPGLLVIAPGNRELLQDYYMTHK 










MLYEEDLQNLCDKTQMSSQQVKQWFAEKMGEE 

TRAVADTGSEDQGPGTGELTAVHKGMGDTYSE 

VSENSESWEPRVPEASSEPFDMSSPQAGRQLETD 


3448 


A 


2 


1324 


FVARAEKGFRTREAHLLQVAGVGTGLQNGASLS 

GLASGVMAQRAFPNPYADYNKSLAEGYFDAAG 

RLTPEFSQRLTNKIRELLQQMERGLKSADPRDGT 

GYTGWAGIAVLYLHLYDVFGDPAYLQLAHGYV 

KQSLNCLTKRSITFLCGDAGPLAVAAVLYHKMN 

NEKQAEDCITRLIHLNBQDPHAPNEMLYGRIGYIY 

ALLFVNKOTGVEKIPQSHIQQICETILTSGENLAR 

KROTTAKSPLMYEWYQEYYVGAAHGLAGIYYY 

LMQPSLQVSQGKLHSLVKPSVDYVCQLKFPSGN 

YPPCIGDNRDLLVHWCHGAPGVIYMLIQAYKVF 

R/EREKYLC\DAYQCADVIWQYGLLKKGYGLCY\ 

GSAGNAYAFLTLYNLTQDMKYLYRACKFAEWC 

LEYGEHGCRTPDT?FSLFEGMAGTIYFL\ADLLFP 

TKAR\FPAFEL 


3449 


A 


3 


2389 


SRHVTGAARSPSRAGPSDPPAMGDEDDDESCAV 

ELRITEANLTGHEEKVSVENFELLKVLGTGAYGK 

VFLWKAGGHDAGKLYAMKVLRKAALVQRAK 

TQEHTRTERSVLELVRQAPFLVTLHYAFQTDAKL 

HLILDYVSGGEMFTHLYQRQYFKEAEVRVYGGE 

IVLALEHLHKLGIIYRDLKLENVLLDSEGHIVLTD 

FGLSKEFLTEEKERTTSFCGTIEYMAPEIIRSKTGH 

GKAVDWWSLGILLFELLTGASPFTLEGERNTQAE 

VSRRDLKCSPPFPPRIGPVAQDLLQRLLCKDPKKR 

LGAGPQGAQEVRNHPFFQGLDWVALAARKIPAP 

FRPQERSELDVGVNFAEEFTRLEPVYSPPGQ\PPPG 

DPRIFQGYSFVAPSILFDHNNAVMTDGLEAPGAG 

DRPGRAAVARSAMMQDSPFFQQYELDLREPALG 

QGSFSVCRRCRQRQSGQEFAVKILSRRLEANTQR 

EVAALRLCQSHPNVVNLHEVHHDQLHTYLVLEL 

LRGGELLEHIRKKRHFSESEASQILRSLVSAVSFM 

HEEAGVVHRDLKPENILYADDTPGAPVKIIDFG/F 

SPRLRPQSPGVPMQTPSFTLQYAAPELLAQQGYD 

ESCDLWSLGVILY\MMLSGQAPFQGASGQGGQS 

QAAEIMCKIREGRFSLDGEAWQGVSEEAKELVR 

GLLTVDPAKRLKLEGLRGSSWLQDGSARSSPPLR 

TPDVLESSGPAVRSGLNATFMAFNRGKREGFFLK 

SVENAPLAKRRKQKLRSATASRRGSPAPANPGR 

APVASKGAPRRANGPLPPS 


3450 


A 


201 


1705 


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLRDS 

EDRSDSRAAQPAHDSGHGDDESPSTSSGTAGTSS 

WELPGFYFDPEKKRYFRLLPGHNNCNPLTKESIR 

QKEMESKRLRLLQEEDRRKKIARMGFNASSMLR 

KSQLGFLNVTNYCHLAHELRLSCMERKKVQIRS 

MDPSALASDRFNLILADTNSDRLFTVNDVTVGGS 

KYGIINLQSLKTPTLKWMHENLYFTNRKV\NSV 

CWASLNHLDSHILLCLMGLAETPGCATLLPASLF 

VNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQA 

NNCFSTGLSRRVLLTNTWTGHRQSFGTNSDVLA 

QQFALMAPLLFNGCRSGEIFAIDLRCGNQGKGW 

KATRLFHDSAVTSVRILQDEQYLMASDMAGKIK 

LWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGI 

LVAVGQDCYTRIWSLHDARLLRTTPSPYPASKAD 










IPSVAFSSRLGGSRGAPGLLMAVGQDLYCYSYS 


3451 


A 


19 


6033 


LLSAMLSHGAGLALWITLSLLQTGLAEPERCNFT 

LAESKASSHSVSIQWRILGSPCNFSLIYSSDTLGA 

ALGPTFRIDNTTYGCNLQDLQAGTIYNFKIISLDE 

ERTVVLQTDPLPPARFGVSKEKTTSTGLHVWWT 

PSSGKVTSYEVQLFDENNQKIQGVQIQESTSWNE 

YTFFNLTAGSKYNIAITAVSGGKRSFSVYTNGST 

VPSPVKDIGISTKANSLLISWSHGSGNVERYRLM 

LMDKGILW1GGVVDKHATSYAFHGLSPGYLYNL 

TVMTEAAGLQNYRWKLVRTAPMEVSNLKVTND 

GSLTSLKVKWQRPPG\NVDSYNITLSHKGTIKESR 

VLAPWIT\ETHFKELVPGRLY\QVTCSAVSLGELS 

AQKMXAVGRTFPDKVANLEANNNGRMRSLVVS 

WSPPAGDWEQYRDLLFNDSVVLLNITVGKEETQ 

YVMDGTGLVPGRQYEVEVIVESGNLKNSERCQG 

RTVPLAVLQLRVKHANETSLSIMWQTPVAEWEK 

YIISLADRDLLLIHKSLSKDAKEFTFTDLVPGRKY 

MATVTSISGDLKNSSSVKGRTVPAQVTDLHVAN 

QGMTSSLFTNWTQAQGDVEFYQVLLIHENVVIK 

NESISSETSRYSFHSLKSGSLYSWVTTVSGGISSR 

QVVVEGRTWSSVSGVTVNNSGRNDYLSVSWLL 

APGDVDNYEVTLSHDGKVVQSLVIAKSVRECSF 

SSLTPGRLYTVTITTRSGKYENHSFSQERTVPDKV 

QGVSVSNSARSDYLRVSWVHATGDFDHYEVTIK 

NKNNFIQTKSIPKSENECVFVQLVPGRLYSVTVT 

TKSGQYEANEQGNGRTIPEPVKDLTLRNRSTEDL 

HVTWSGANGDVDQYEIQLLFNDMKVFPPFHLVN 

TATEYRFTSLTPGRQYKILVLTISGDVQQSAFIEG 

FTVPSAVKNIfflSPNGATDSLTVNWTPGGGDVDS 

YTVSAFRHSQKVDSQTIPKHVFEHTFHRLEAGEQ 

YQIMIASVSGSLKNQINVVGRTVPASVQGVIADN 

AYSSYSLIVSWQKAAGVAERYDELLLTENGILLR 

NTSEPATTKQHKFEDLTPGKKYKIQILTVSGGLFS 

KEAQTEGRTVPAAVTDLRITENSTRHLSFRWTAS 

EGELSWYNIFLYNPDGNLQERAQVDPLVQSFSFQ 

NLLQGRMYKMVIVTHSGELSNESFIFGRTVPASV 

SHLRGSNRNTTDSLWFNWSPASGDFDFYELILYN 

PNGTKKENWKDKDLTEWRFQGLVPGRKYVLW 

WTHSGDLSNKVTAESRTAPSPPSLMSFADIANT 

SLAITWKGPPDWTOYNDFELQWLPRDALTVFNP 

YNNRKSEGRIVYGLRPGRSYQFNVKTVSGDSWK 

TYSiCPIFGSVRTKPDKIQNLHCRPQNSTAIACSWI 

PPDSDFDGYSffiCRKMDTQEVEFSRKLEKEKSLL 

NIMMLVPHKRYLVSIKVQSAGMTSEVVEDSTIT 

MTORPPPPPPHIRVNEKDVLISKSSIN^^ 

DTNGAVKYFTWVREADGSDELKPEQQHPLPSY 

LEYRHNASIRVYQTNYFASKCAENPNSNSKSFNI 

KLGAEMESLGGKCDPTQQKFCDGPLKPHTAYRI 

SIRAFTQLFDEDLKEFTKPLYSDTFFSLPITTESEP 

LFGAffiGVSAGLFLIGMLVAWALLICRQKVSHG 

RERPSARLSIRRDRPLSVHLNLGQKGNRKTSCPIK 

INQFEGHFMKLQADSNYLLSKEYEELKDVGRNQ 

SCDIALLPENRGKNRYNNELPYDATRVKLSNVDD 

DPCSDYINASYIPGNNFRREYIVTQGPLPGTKDDF 

WKMVWEQNVHNIVMVTQCVEKGRVKCDHYW 










PADQDSLYYGDLILQMLSESVLPEWTIREFKICGE 

EQLDAHRLIRHFHYTVWPDHGVPETTQSLIQFVR 

TVRDYINRSPGAGPTVVHCSAGVGRTGTFIALDR 

ILQQLDSKDSVDIYGAVVHDLRLHRVHMVQTEC 

QYVYLHQCVRDVLRARKLRSEQENPLFPIYENV 

NPEYHRDPVYSRH 


3452 


A 


63 


1073 


FFRSSSDNGSPIRQYE/HSTPAHQGPVMGLEGKS/ 

ARNSQLRIVLVGKTGAGKSATGNSILGRKVFHSG 

TAAKSITKKCEKRSSSWKETELVVVDTPGEFDTE 

VPNAETSKEIIRCILLTSPGPHALLLVVPLGRYTEE 

EHKATEKILKMFGERARSFMILIFTRICDDLGDTN 

LHDYLREAPEDIQDLMDIFGDRYCALNNKATGA 

EQEAQRAQLLGLIQRVVRENKEGCYTNRMYQR 

AEEEIQKQTQAMQELHRVELEREKARIREEYEEK 

IRKLEDKVEQEKRXKQMEKKLAEQEAHYAVRQ 

QRARTEVESKDGELELIMTALQIASFILLRLFAED 


3453 


A 


2674 


514 


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVQAV 
DKXVDCPRLCTCEIRPWPTPRSIYMEASTVDCN^ 
LGLLTFPARLPANTQILLLQTNNIAKIEYSTOFPV 
NLTGLDLSQNNLSSVTNINGKKMPQLLSVYLEEN 
KLTELPEKCLSELSNLQELYINHNLLSTISPGAFIG 
LHNLLRLHLNSNRLQMINSKWFDALPNLEILMIG 
ENPIDUKDMNFKPLINLRSLVIAGINLTEIPDNAL 
VGLENLESISFYDNRLIKVPHVALQKWNLKFLD 
LNKNPINRIRRGDFSNMLHLKELGINNMPELISED 
SLAVDNLPDLRKIEATNNPRLSYIHPNAPFRLPKL 
ESLMLNSNALSALYHGTIESLPNLKEISIHSNPIRC 
DCVIRWMhMNKTNIRFMEPDSLFCVDPPEFQGQ 
NVRQVHFRDMMEICLPL1APESFPSNLNVEAGSY 
VSFHCRATAVEPQPEIYWITPSGQKLLPN'RLTDKF 
YVHSEGTLDINGVTPKEGGLYTCIATNLVGADLK 
SVM1KVDGSFPQDNNGSLNIKIRDIQANSVLVSW 
KASSK1LKSSVKWTAFVKTENSHAAQSARIPSDV 
KVYNLTHLNPSTEYKICmiPTIYQKNRKXCVWT 
TKGLHPDQKEYEKNNTTTLMACLGGLLGIIGVIC 
LISCLSPEMNCDGGHSYVRNYLQKPTFALGELYP 
PLINLWEAGKEKSTSLKVKATVIGLPTNMS 


3454 


A 


1844 


244 


ERYLFATYVAPSATLDIGLQQEKKKEIYMKIQPP 

FEDLFDTAEEYILLLLLEPWTKMVKSDQIAYKKV 

ELVEETRQLDSTYFRI<LQALHKETFSKKAEDTTC 

EIGTGILSLSlsrVSKRTEYWDNVPAEYKHFKFSDL 

LNNKLEFEHFRQFLETHSSSMDLMCWTDIEQFRR 

ITYRDRNQRKAKSIYIKNKYLNKKYFFGPNSPAS 

LYQQNQVMHLSGGWGKILHEQLDAPVLVEIQK 

HVQNRLENVWLPLFLASEQFAARQKIKVQMKDI 

AEELLLQKAEKKIGVWKPVESKWISSSCK11AFRK 

ALLNPVTSRQFQRFVALKGDLLENGLLFWQEVQ 

KYKDLCHSHCDESVIQKKITTIINCFINSSIPPALQI 

DIPVEQAQKJIEHRKELGPYVFREAQMTFLGVMF 

KFWPQFCEFRKNLTDENIMSVLERRQEYNKQKK 

KLAVL/QNDEKSGKDGIKQYANTSVPAIKTALLS 

DSFLGLQPYGRQPTWCYSKYIEALEQERILLKIQE 

ELEK\SCLQACNLSQILRLALQLCL 


3455 


A 


228 


3330 


APTAQAMMSFGGADALLGAPFAPLHGGGSLHY 
ALARKGGAGGTRSAAGSSSGFHSWTRTSVSSVS 










ASPSRFRGAGAASSTDSLDTLSNGPEGCMVAVA 
TSRSEKEQLQALNDRFAGYIDKVRQLEAHNRSLE 
GEAAALRQQQAGRSAMGELYEREVREMRGAVL 
RLGAARGQLRLEQEHLLEDIAHVRQRLDDEARQ 
REEAEAAARALARFAQEAEAARVDLQKKAQAL 
QEECGYLRRHHQEEVGELLGQIQGSGAAQAQM 
QAETRDALKCDVTSALREIRAQLEGHAVQSTLQ 
SEEWFRVRLDRLSEAAKVNTDAMRSAQEEITEY 
RRQLQARTTELEALKSTKDSLERQRSELEDRHQA 
DIASYQEAIQQLDAELRNTKWEMAAQLREYQDL 
LNVKMALDIEIAAYRKLLEGEECRIGFGPIPFSLP 
EGLPKIPSVSTHIKVKSEEKIKWEKSEKETVIVEE 
QTEETQVTEEVTEEEDKEAKEEEGKEEEGGEEEE 
AEGGEEETKSPPAEEAASPEKEAKSPVKEEAKSP 
AEAKSPEKEEAKSPAEVKSPEKAKSPAKEEAKSP 
PEVAKSPEKDGKQNFQAEVKSPEKAKSPAKEEAK 
SPAEAKSPEKAKSPVKEEAKSPAEAKSPVKEEAK 
SPAEVKSPEKAKSPTKEE\AKSPEKAKSPEKAKSP 
EKEEAKSPEKAKSPVKAEAKSPEKAKSPVKAEA 
KSPEKAKSPVKEEAKSPEKAKSPVKEEAKSPEKA 
KSPVKEEAKTPEKAKSPVKEEAKSPEKAKSPEKA 
KTLDVKSPEAKTPAKEEARSPADKFPEKAKSPVK 
EEVKSPEKAKSPLKEDAKAPEKEIPKKEEVKSPV 
KEEEKPQEVKVKEPPKKAEEEKAPATPKTEEKK 
DSKKEEAPKKEAPKPKVEEKKEPAVEKPKESKV 
EAKKEEAEDKKKVPTPEKEAPAKVEVKEDAKPK 
EKTEVAKKEPDDAKAKEPSKPAEKKEAAPEKKD 
TKEEKAKKPEEKPKTEAKAKEDDKTLSKEPSKP 
KAEKAEKSSSTDQKDSKPPEKATEDKAAKGK 


3456 


A 


258 


1463 


YLSFIPGHASKSAPMNGHCFAENGPSQKSSLPPLL 

IPPSENLGPHEEDQVVCGFKKLTVNGVCASTPPL 

TPIKNSPSLFPCAPLCERGSRPLPPLPISEALSLDDT 

DCEVEFLTSSDTDFLLEDSTLSDFKYDVPG\RRSF 

RGCGQINYAYFDTPAVSAADLSYVSDQNG\GVP 

DPNPPPPQTHRRLRRSHSGPAGSFNKPAIRISNCCI 

HRASPNSDEDKPEVPPRVPIPPRPVKPDYRRWSA 

EVTSSTYSDEDRPPKVPPREPLSPSNSRTPSPKSLP 

SYLNGVMPPTQSFAPDPKYVSSKALQRQNSEGS 

ASKYPCE.PIIENGKKVSSTHYYLLPERPPYLDKY 

EKFFREAKKKNGGAQIQPLPADCGISSATEKPDS 

KTKMDLGGHVKRKHLSYVGTP 


3457 


A 


2 


4869 


FILSSSSSASSEHFHHHYSFGNWWPGSFKGHRMS 

LPFYQRCHQHYDLSYRNKDVRSTVSHYQREKKR 

SAVYTQGSTAYSSRSSAAHRRESEAFRRASASSS 

QQQASQHALSSEVSRKAASAYDYGSSHGLTDSS 

LLLDDYSSKLSPKPKRAKHSLLSGEEICENLPSDY 

MVPIFSGRQKHVSGITDTEEERIICEAAAYIAQRNL 

LASEEGITTPKQSTASKQTTASKQSTASKQSTASK 

QSTASRQSTASRQSVVSKQATSALQQEETSEKKS 

RKVVIRGKAERLSLRKTLEETETYHAKLNEDHLL 

HAPEFIDCPRSHTVWEKENVKLHCSIAGWPEPRV 

TWYKNQVPINVHANPGKYIIESRYGMHTLEINAC 

DFEDTAQYRASAMNVKGELSAYASVVVKRYKG 

EFDETRFHAGASTMPLSFGVTPYGYASRFEIHFD 

DKFDVSFGREGETMSLGCRWITPEIKHFQPEIQ 










WYRNGVPLSPSKWVQTLWSGERATLTFSHLNKE 

DEGLYmVRMGEYYEQYSAYVFVRDADAEIEG 

APAAPLDVKCLEANKDYinSWKQPAVDGGSPIL 

GYFEDKCEVGTDSWSQCNDTPVKFARFPVTGLIE 

GRSYIFRVRAVNKMGIGFPSRVSEPVAALDPAEK 

ARLKS/PPLSTLDWTWIVTEEEPSEGIVPGPPTDLS 

VTEATRSYVVLSWKPPGQRGHEGIMYFVEKCEA 

GTENWQRVNTELPVKSPRFALFDLAEGKSYCFR 

VRCSNSAGVGEPSEATEVTVVGDKLDIPKAPGKI 

iPSRNTDTSVVVSWEESKDAKELVGYYIEANVA 

GSGKWEPCNNNPVKTHRFTCHGLVTGQSYIFRV 

RAVNAAGLSEYSQDSEAIEVKAAIAPPSPPCDITC 

LESFRDSMVLGWKQPDKIGGAEITGYYVNYREV 

IDGVPGKWREANVKAVSEEAYKISNLKENMVY 

QFQVAAMNMAGLGAPSAVSECFKCEEWT1AVP 

GPPHSLKCSEVRKDSLVLQWKPPVHSGRTPVTG 

YFVDLKEAKAKEDQWRGLNEAAIKNVYLKVRG 

LKEGVSYVFRVRAINQAGVGKPSDLAGPVVAET 

RPGTKEVWNVDDDGVISLNFECDKMTPKSEFS 

WSKDYVSTEDSPRLEVESKGNKTKMTFKDLGM 

DDLGIYSCDVTDTDGIASSYLIDEEELKRLLALSH 

EHKFPTVPVKSELAVEILEKGQVRF\WMQAEKLS 

GNAKVNYEFNEKGIFEGPKYKMHIDRNTGIIEMF 

MEKLQDEDEGTYTFQLQDGKATNHSTVVLVGD 

V7KKLQKEAEFQRQEWIRKQGPHFVEYLSWEVT 

GECNVLLKCKVA>nXKETHIVWYKDEREISVDE 

KHDFKDGICTLLITEFSKXDAGIYEVILKDDRGK 

DKSRLKLVDEAFKELMMEVCKKIALSATDLKIQ 

STAEGIQLYSFVTYYVEDLKVNWSHNGSAIRYSD 

RVKTGVTGEQIWLQINEPTPNDKGKYVMELFDG 

KTGHQKTVDLSGQAYDEAYAEFQRLKQAAIAEK 

NRARVLGGLPDVVTIQEGKALNLTCNVWGDPPP 

EVSWLKNEKALASDDHCNLKFEAGRTAYFTING 

VSTADSGKYGLWKNKYGSETSDFTVSVFPEEE 

ARMAALESLKGGKKAK 


3458 


A 


3963 


827 


LSRSSSDNNTNTLGRNVMSTATSPLMGAQSFPNL 
TTPGTTSTVTMSTSSVTSSSNVATATTVLSVGQS 
LSNTLTTSLTSTSSESDTGQEAEYSLYDFLDSCRA 
STLLAELDDDEDLPEPDEEDDENEDDNQEDQEY 
EEVMILRRPSLQRRAGSRSDVTHHAVTSQLPQVP 
AGAGSRPIGEQEEEEYETKGGRRRTWDDDYVLK 
RQFSALVPAFDPRPGRTNVQQTTDLEIPPPGTPHS 
ELLEEVECTPSPRLALTLKVTGLGTTREVELPLTO 
FRSTD^VQKIXQLSCNGNVKSDKLRRIWEPTY 
TIMYREMKDSDKEKENGKMGCWSIEHVEQYLG 
TDELPKNDLITYLQKNADAAFLRHWKLTGTNKS 
IRKNRNCSQLIAAYWDLG\EHGTK\SGLNQGAIST 
LQSSDILNLTKEQPQAKAGNGQNSCGVEDVLQL 
LRHYrVASDPYSRISQEDGDEQPQFTFPPDEFTS/ 
BCKITTKILQQIEEPLALASGALPDWCEQLTSKCPF 
LIPFETRQLYFTCTAFGASRAIVWLQNRREATVE 
RTRTTSSVRRDDPGEFRVGRLKHERVKVPRGESL 
MEWAENVMQIHADRKSVLEVEFLGEEGTGLGPT 
LEFYALVAAEFQRTDLGAWLCDDNFPDDESRHV 
DLGGGLKPPGYYVQRSCGLFTAPFPQDSDELERI 










TKLFHFLGIFLAKCIQDNRLVDLPISKPFFKLMCM 

GDIKSNMSKLIYESRGDRDLHCTESQSEASTEEG 

HDSLSVGSFEEDSKSEFDLDPPBCPKPPAWFNGILT 

WEDFELVNPHRARFLKEIKDLAIKRRQILSNKGL 

SEDEKNTKLQELVLKNPSGSGPPLSIEDLGLNFQF 

CPSSRIYGFTAVDLKPSGEDEMITMDNAEEYVDL 

MFDFCMHTGlQKQMEAFRDGFNKVFPMEKLSSF 

SHEEVQMILCGNQSPSWAAEDIINYTEPKLGYTR 

DSPGFLRFVR\^CGMSSDERKAFLQFTTGCSTLP 

PGGLANLHPRLTVVRKVDATDASYPSVNTCVHY 

LKLPEYSSEEIMRERLLAATMEKGFHLN 


3459 


A 


88 


603 


SCGPRGLASLGLGFSGRCDDQNKGRS\DGPEAQA 

EACSGERTYQELLVNQNPIAQPLASRRLTRKLYK 

CIKKAVKQKQ1RRGVKEVQKFVNKGEKGIMVLA 

GDTLPIEVYCHLPVMCEDRNLPYVYIPSKTDLGA 

AAGSKRPTCVIMVKPHEEYQEAYDECLEEVQSL 

PLPL 


3460 


A 


139 


1997 


QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDVVAPKPPIEPEEEK 

TLKKDEEN^)SKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NWGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFWGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSrTDWTVKLWTTKNKKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRDPA 


3461 


A 


139 


1997 


QVT^SDKSELKAELERKKQRLAQIREEKKRKE 
EERKKKETOQKKEAVAPVQEESDLEKKRREAEA 
LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 
SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 
KETQTPVMAQPKEDEEEDDDVVAPKPPIIEPEEEK 
TLKKDEEN\DSKAPPHELTEEEKQQILHSEEFLSFF 
DHSTRIVERALSEQIN1FFDYSGRDF/ENDKEGEIQ 
AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 
YP\ELLVASYNNNEDAPHEPDGVALVWNMI<:YK 
KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 
GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 
NVVGTQNAHNLISISTDGKICSWSLDMLSHPQDS 
MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 
GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLWTSSFDWT\^WTTTC>INKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

\\^NM)TEVPTASISVEGNPALNRVRWTHSGRE 

UVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRIPA 


3462 


A 


2 


2643 


TAPEFSRSTHASAHASVARVLRNREIAQLKKEQR 

RQEFQIRALESQKRQQEMVLRRKTQEVSALRRL 

AKPMSERVAGRAGLKPPMLDSGAEVSASTTSSE 

AESGARSVSSWRQWNRKINHFLGDHPAPTVNGT 

RPARKKFQKKGASQSFSKAARLKWQSLERRITDI 

VMQRMTIVI^EADIVDBRLIKKREELFLLQEALRR 

KRERLQAESPEEEKGLQELAEEIEVLAANIDYIND 

GITDCQATIVQLEETKEELDSTDTSWISSCSLAE 

ARLLLDNFLKASEDKGLQVAQKEAQIRLLEGRLR 

QTDMAGSSQNHLLLDALREKAEAHPELQALIYN 

VQQENGYASTDEEISEFSEGSFSQSFTMKGSTSH 

DDFKFKSEPKLSAQMKAVSAECLGPPLDISTKNI 

TKSLASLVEIKEDGVGFSVRDPYYRDRVSRTVSL 

PTRGSTFPRQSRATETSPLTRRKSYDRGQPIRSTD 

VGFTPPSSPPTRPRNDRNVFSRLTSNQSQGSALD 

KSDDSDSSL\SEVLRGIISPVGGAKGARTAPLQCV 

SMAEGHTKPILCLDATDELLFTGSKDRSCKMWN 

LVTGQEIAALKGHPNNVVSIKYCSHSGLVFSVST 

SYIKVWDIRDSAKCIRTLTSSGQVISGDACAATST 

RAITSAQGEHQINQIALSPSGTMLYAASGNAVRI 

WELSRFQPVGKLTGHIGPVMCLTVTQTASQHDL 

WTGSKDHYVKMFELGECVTGTIGPTHNFEPPH 

YDGIECLAIQGDILFSGSRDNGIKKWDLDQQELIQ 

QIPNAHKDWVCALAFIPGRPMLLSACRAGVIKV 

WNVDNFTPIGEIKGPIDSPINAICTNAKHIFTASSG 

CRVKVWNYVPGLTPCLPRRVLAIKGRATTLP 


3463 


A 


198 


3146 


SGEPRPEPGNMATCIGEKIEDFKVGNLLGKGSFA 
GVYRAESIHTGLEVAIKMroKKA^mCAGMVQR 
VQNEVKIHGQLKHPSILELYNYFEDSNYVYLVLE 
MCHNGEMNRYLKNRVKPFSENEARHFMHQIITG 
MLYLHSHGILHRDLTLSNLLLTRMvlNIKJADFG 
ATQLKMPHEKHYTLCGTPNYISPEIATRSAHGLE 
SDVWSLGCMF^TLLIGRPPFDTDTVKNTLNKVV 
LADYEMPTFLS1EAKDLIHQLLRRNPADRLSLSSV 
LDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAI 
TASSSTSISGSLFDKilRLLIGQPLPNKMTVFPKNK 
SSTOFSSSGDGNSFYTQWGNQETSNSGRGRVIQD 
AEERPHSRYLRRAYSSDRSGTSNSQSQAKTYTM 
ERCHSAEMLSVSKRSGGGENEERYSPTDNNAMF 
NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFP 
FADPTPQTETVQQWFGNLQINAHLRJCTTEYDSIS 
PNRDFQGHPDLQKDTSKNAWTDTKVKKNSDAS 
DNAHSVKQQNTMKYMTALHSKPEUQQECVFGS 
DPLSEQSKTRGMEPPWGYQNRTLRSITSPLVAHR 
LKPIRQKTKKAVVSILDSEEVCVELVKEYASQEY 
VKEVLQISSDGNTITIYYPNGG\RGFPLA\DRPPSP 
TVDNISR\YSF\DNLPEKYWRKYQYASRFVQLVRS 
KSPKITYFTRYAKCILMENSPGADFEVWFYDGV 
KIHKTEDFIQVIEKTGKSYTLKSESEVNSLKEEIK 
MYMDHANEGHRICLALESHSEEERKTRSAPFFPn 
IGRKPGSTSSPKALSPPPSVDSNYPTRDRASFNRM 
VMHSAASPTQAPILNPSMVTNEGLGLTTTASGTD 
ISSNSLKDCLPKSAQLLKSVFVKNVGWATQ\LTS 
GAVWVQFNDGSQLVVQAGVSSISYTSPNGQ\TTR 
\YGENEKLPDYIKQKLQCLSSILLMFSNPTPNFH 


3464 


A 


14 


348 


AVRTVSGTSLGPRSHSRSPGRCHCFSAVTFSSPRL 
AASEAPDPMEEWDVPQMKKEVESLKYQLAFQR 
EMASKTIPELLKWIEDGIPKDPFLNPDLMKKNPW 
V\EKGKCTIL 


3465 


A 


5537 


405 


VRKLDRERVGAWWRGAWARHPRQEAGEHAKR 

RKGHAETPRGRRKGRAGRSAAAVGELRPARRSL 

ETSRAAAAMAKDSPSPLGASPKKPGCSSPAAAV 

LENQRRELEKLRAELEAERAGWRAERRRFAARE 

RQLREEAERERRQLADRLRSKWEAQRSRELRQL 

QEEMQREREAEIRQLLRWKEAEQRQLQQLLHRE 

RDGVVRQARELQRQLAEELVNRGHCSRPGASEV 

SAAQCRCRLQEVLAQLRWQTDGEQAARIRYLQ 

AALEVERQLFLKYILAHFRGHPALSGSPDPQAVH 

SLEEPLPQTSSGSCHAPKPACQLGSLDSLSAEVG 

VRSRSLGLVSSACSSSPDGLLSTHASSLDCFAPAC 

SRSLDSTRSLPKASKSEERPSSPDTSTPGSRRLSPP 

PSPLPPPPPPSAHRKLSNPRGGEGSESQPCEVLTPS 

PPGLGHHELIXLNWLLAKALWVLARRCYTLQEE 

NKQLRRAGCPYQADEKVKRLKVKRAELTGLAR 

RLADRARELQETOLRAVSAPEPGESCAGLELCQV 

FARQRARDLSEQASAPLAKDKQIEELRQECHLLQ 

ARVASGPCSDLHTGRGGPCTQWLNVRDLDRLQ 

RESQREVLRLQRQLMLQQGMGGAWPEAGGQSA 

TCEEVRRQMLALERELDQRRRECQELGAQAAPA 

RRRGEEAETQLQAALLKNAWLAEENGRLQAKT 

DWVRKVEAENSEVRGHLGRACQERDASGLIAEQ 

LLQQAARGQDRQQQLQRDPQKALCDLHPSWKEI 

QALQCRPGHPPEQPWETSQMPESQVKGSRRPKF 

HARAEDYAVSQPNRDIQEKREASLEESPVALGES 

ASVPQVSETVPASQPLSKKTSSQSNSSSEGSMWA 

TVPSSPTLDRDTASEVDDLEPDSVSLALEMGGSA 

APAAPKLKIFMAQYNYNPFEGPNDHPEGELPLTA 

GDYIWGDMDEDGFYEGELEDGRRGLWSNFVE 

QIPDSYIPGCLPAKSPDLGPSQLPAGQDEALEEDS 

LLSGKAQGWDRGLCQlNdVRVGSKTEVATEILDT 

KTEACQLGLLQSMGKQGLSRPLLGTKGVLRMAP 

MQLHLQKVTATSANITWWSSHRHPHVVYLDD 

REHALTPAGVSCYTFQGLCPGTtnTvARVEVRLP 

RDLLQVYWGTMSSTVTFDTLLAGPPYPPLDVLV 

ERHASPGVLWSWLPVTIDSAGSSNGVQVTGYA 

VYADGLKVCEVADATAGSTLLEFSQLQVPLTWQ 

KVSVRTMSLCGESLDSVPAQIPEDFFMCHRWPET 

PPFSYTCGDPSTYRVTFPVCPQICLSLAPPSAKASP 

HNPGSCGEPQAKFLEAFFEEPPRRQSPVSNLGSE 

GECPSSGAGSQAQELAEAWEGCRKDLLFQKSPQ 

NHRPPSVSDQTGEKENCYQHMGTSKSPAPGFIHL 

RTECGPRKEPCQEKAALERVLRQKQDAQGFTPP 

QLGASQQYASDFHNVLKEEQEALCLDLWGTERR 

iEERREPEPHSRQGQALGVKRGCQLHEPSSALCPA 

PSAKVIKMPRGGPQQLGTGANTPARVFVALSDY 

NPLVMSAMLKAAEEELVFQKRQLLRVWGSQDT 

HDFYLSECNRQVGNIPGRLVAEMEVGTEQTDRR 

WRSPAQGHLPSVAHLEDFQGLTIPQGSSLVLQGN 

SKRLPLWTPKIMIAALDYDPGDGQMGGQGKGRL 

ALRAGDVVMVY\GPMDDQGFYYGELGGHRG\L 






WO 01/57190 



PCT/US01/04098 













VPANLRKMSSQGH 


3466 


A 


1 


1111 


MSKPPDLLLRLLRGAPRQRVCTLFIIGFICFTPFVSI 

MIYWHVVGEPKEKGQLYNLPAEIPCPTLTPPTPP 

SHGPTPGNIFFLETSDRTNPNFLFMCSVESAARTH 

PESHVLVLMKGLPGGNASLPRHLGISLLSCFPNV 

QMLPLDLRELFRDTPLADWYAAVQGRWEPYLL 

PVLSDASRLALMWKFGGIYLDTDFIVLKNLRNLT 

NVLGTQSRYVLNGAFLAFERRHEFMALCMRDFV 

DHYNGWIWGHQGPQLLTRVFKKWCSIRSLAESR 

ACRGVTTLPPEAFYP1PWQDWKKYFEDTNPEELP 

RLLSATYAVHVWNKKSQGTRFEATSRALLAQLH 

ARYCPTTHE/DHENVLVKGPAGHLPNLLLMGHW 


3467 


A 


1 


2175 


MAKVILKQSKQCKNLLTCKVAQVCPVCGCLHC 

YFWWLSGLESRRPSSPLID1KPIEFGVLSAKKEPIQ 

PSVLRRTYNPDDYFRKFEPHLYSLDSNSDDVDSL 

TDEEILSKYQLGMLHFSTQYDLLHNHLTVRVIEA 

RDLPPPISHDGSRQDMAHSNPYVKICLLPDQKNS 

KQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLLL 

TVVDFDKFSRHCVIGKVSVPLCEVDLVKGGHW 

WKAHDSQFSAPGLPADQQFFADLFSGLVLNPQL 

LGRVWFASQPASLPVGSLCIDFPRLDIVLRGEYG 

NLLEAKQQRL\^GEMLFIPARAANLPVNNKPVM 

LLSLVFAPTWLGLSFYDSRTTSLLHPARQIQLP\SL 

QRGEGEAMLS\ALTLFSRSPLEQNIIQPLVLSLLHL 

CGSWNMPPGNSQPRGDFLYHS1CTWVQDNYAQ 

PLTRESVAQFFNITPNHLSKLFAQHGTMRFIEYVR 

WVRMAKARMILQKYHLSIHEVAQRCGFPDSDYF 

CRVFRRQFGMDYVDILQIHRWDYNTPIEETLEAL 

NDWKAGKARYIGASSMHASQFAQALELQKQH 

GWAQFVSMQDHYNLIYREEEREMLPLCYQEGV 

AVIPWSPLARGRLTRPWGETTARLVSDEVGKNL 

YKESDENDAQIAERLTGVSEELGATRAQVALAW 

LLSKPGIAAPEGTSREEQLDELLNAVDITLKPEQI 

AELETPYKPHPVVGFK 


3468 


A 


147 


3209 


ALPLPLPTLYPGMSRRKQRKPQQLISDCEGPSASE 

NGDASEEDHPQVCAKCCAQFTDPTEFLAHQNAG 

STDPPVMVIIGGQENPNNSSASSEPRPEGHNNPQ 

VMDTEHSNPPDSGSSVPTDPTWGPERRGEESSGH 

FLVAATGTAAGGGGGLELASPKLGATPLPPESTP 

APPPPPPPPPPPGVGSGHLNIPLILEELRVLQQRQr 

HQMQMTEQICRQVLLLGSLGQTVGAPASPSELP 

GTGTASSTKPLLPLFSPIKPVQTSKTLASSSSSSSS 

SSGAETPKQAFFHLYHPLGSQHPFSAGGVGRSHK 

PTPAPSPALPGSTDQLIASPHLAFPSTTGLLAAQC 

LGAARGLEATASPGLLKPKNGSGELSYGEVMGP 

LEKPGGRHKCRFCAKVFGSDSALQMLRSHTGER 

PYKCNVCGNRFTTRGNLKVHFHRHREKYPHVQ 

MNPHPVPErlLDYVITSSGLPYGMSVPPEKAEEEA 

ATPGGGVERKPLVASTTALSATESLTLLSTSAGT 

ATAPGLPAFNKFVLMKAVEPKNKADENTPPGSE 

GSAISGVAESSTATRMQLSKLVTSLPSWALLTOH 

FKSTGSFPLPLCARALG\ASPSETSKLQQLVEKID 

RQGAVAVTSAASGAPTTSAPAPSSSASSGPNQCV 

ICLRVLSCPRALRLHYGQHGGERPFKCKVCGRAF 

STRGNLRAHFVGHKASPAARAQNSCPICQKKFT 










NAVTLQQHVRMHLGGQEPNGGTALPEGGGAAQ 

ENGSEQSTVSGAGSFPQQQSQQPSPEEELSEEEEE 

EDEEEEEDVTDEDSLAGRGSESGGEKA1SVRGDS 

EEASGAEEEVGTVAAAATAGKEMDSNEKTTQQS 

SLPPPPPPDSLDQPQPMEQGSSGVLGGKEEGGKP 

ERSSSPASALTPEGEATSVTLVEELSLQEAMRKEP 

GESSSRKACEVCGQAFPSQAAL\EEH\QKTHPKEG 

PLRTCWCRQGFLERAmCKHMLLAHHQVQPFA 

PHGPQNIAALSLVPGCSPSITSTGLSPFPRKDDPTI 

P 


3469 


A 


3 


5664 


NLRPLSFALFLGDPNMANLEESFPRGGTRKIHKP 
EKAFQQSVEQDNLroiSTEEGSTKRKKSQKGPAK 
TKKLKIEICRESSKSAREKFEILSVESLCEGMRILG 
CVKEVNELELVISLPNGLQGFVQVTEICDAYTKK 
LNEQVTQEQPLKDLLHLPELFSPGMLVRCVVSSL 
GITDRGKKSVKLSLNPKNVNRVLSAEALKPGML 
LTGTVSSLEDHGYLVDIGVDGTRAFLPLLKAQEY 
IRQKNKGAKLKVGQYLNCIVEKVKGNGGWSLS 
VGHSEVSTAIATEQQSWNLNNLLPGLVVKAQVQ 
KVTPFGLTL^TFFTGVVDFMHLDPKKAGTYFS 
NQAVRACILCVHPRTRWHLSLRPIFLQPGRPLTR 
LSCQNLGAVLDDVPVQGFFKKAGATFRLKDGVL 
AYARLSHLSDSK3WFNPEAFKPGNTHKCRIIDYS 
QMDELALLSLRTSIIEAQYLRYHDIEPGAVVKGT 
VLTIKSYGMLVKVGEQMRGLVPPMHLADILMK 
NPEKKYHIGDEVKCRVLLCDPEAKKLMMTLKKT 
LIESKLPVITCYADAKPGLQTHGFIIRVKDYGCIV 
KFYNNVQGLVPKHELSTEYIPDPERVFYTGQW 
KWVLNCEPSKERMLLSFKLSSDPEPKKEPAGHS 
QKKGKAIMGQLVDVICVLEKTKDGLEVAVLPHN 
IRAFLPTSHLSDHVANGPLLHHWLQAGDILHRVL 
CLSQSEGRVLLCRKPALVSTVEGGQDPKNFSEIH 
PGMLLIGFVKSIKDYGVFIQLPSGLSGLAPKAIMS 
DKFVTSTSDHFVEGQTVAAKVTNVDEEKQRMLL 
SLRLSDCGLGDLAITSLLLLNQCLEELQGVRSLM 
SNRDSVLIQTLAEMTPGMFLDLVVQEVLEDGSV 
VFSGGPVPDLVLKASRYHRAGQEVESGQIGCKVV 
ILNVDLLKLEVHVSLHQ\DLV\NRKARKLRKGSE 
HiQAIVQHLEKSFAIASLVETGHLAAFSLTSHLND 
TFRFDSEKLQVGQGVSLTLKTTEPGVTGLLLAVE 
GPAAKRTMRPTQICDSETVDEDEEVDPALTVGTI 
KKHTLSIGDMVTGTVKSIKPTHVVVTLEDGIIGCI 
HASHILDDVPEGTSPTTKLKVGKTVTARVIGGRD 
MKTFKYLPISHPRFVRTIPELSVRPSELEDGHTAL 
NTHSVSPMEKIKQYQAGQTVTCFLKKYNVVKK 
WLEVEIAPDIRGRIPLLLTSLSFKVLKHPDKKFRV 
GQALRATVVGPDSSKTFLCLSLTGPHKLEEGEVA 
MGRVVKVTPNEGLTVSFPFGKIGTVSIFHMSDSY 
SETPLEDFVPQKWRCYILSTADNVLTLSLRSSRT 
NPETKSKVEDPEINSIQDIKEGQLLRGYVGSIQPH 
GWFRLGPSWGLARYSHVSQHSPSKKALYNKH 
LPEGKLLTARVLRLNHQKNLVELSFLPGDTGKPD 
VLSASLEGQLTKQEERKTEAEERDQKGEKKNQK 
RNEKKNQKGQEEVEMPSKEKQQPQKPQAQKRG 
GRECRESGSEQERVSKKPKKAGLSEEDDSLVDV 










YYREGKEEAEETNVLPKEKQTKPAEAPRLQLSSG 

FAWNVGLDSLTPALPPLAESSDSEEDEKPHQATI 

KKSKKERELEKQKAEKELSRTEEALMDPGRQPE 

SADDFDRLVLSSPNSSILWLQYMAFHLQATEDEK 

ARAVAERALKTISFREEQEKLNVWVALLNLENM 

YGSQESLTKVFERAVQYNEPLKVFLHLADIYAKS 

EKFQEAGELYNRMLKRFRQEKAVWIKYGAFLLR 

RSQAAASHRVLQRALECLPSKEHVDVIAKFAQL 

EFQLGDAERAKAIFENTLSTYPKRTDVWSVY1D 

MTIKHGSQKDVRDIFERVIHLSLAPKRMKFFFKR 

YLDYEKQHGTEKDVQAVKAKALEYVEAKSSVL 

ED 


3470 


A 


2334 


1226 


TAAAPVAPGTMDDATVLRKKGYIVGENLGKGSY 

AKVKSAYSERLKFNVAVKDARKKTPTDFVERFL 

PREMD1LATVNHGSIKTYEIFETSDGRIYIIMELG 

VQGDLLEFIKCQGALHEDVARKMFRQLSSAVKY 

CHDLDIVHRDLKCENLLLDKDFNIKLSDFGFSKR 

CLRDSNGRIILSKTFCGSAAYAAPEVLQSIPYQPK 

VYDIWSLGVILYIMVCGSMPYDDSDIRKMLRIQK 

EHR\nDFPRSKKLTCECKDLIYRMLQ\PDVS\KRLH 

EDEILSHSWLQPPKPK\ATSSASFKREGEGKYRAE 

CKLDTKTGLRPDHRPDHICLGAKTQHRLLVVPEN 

ENRMEDRLAETSRAKDHHISGAEVGKAST- 


3471 


A 


537 


148 


TERGAPQHPTLPLPSLTPSSVHTGQPKTTPSVILFL 
PSCEEPQANKATLVCLMNN/FYPGILMVTWKAD 
GTLITQSVEKTTPSKQSNNKYVASSYLSLTPEQW 
RSRRSYSCQVMQEGSTVEKSVAPAECS 


3472 


A 


1 


2272 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLPNHVVFLRLREGLKNQSPTEAEKPASSSLPSS 

PPPQLLTRNVVFGLGGELFLWDGEDSSFLVVRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGIKGLMVLELPKRWGBCNSEFEGGKST 

VNCSTTPVAERFFTSSTSLTLKHAAWYPSEILiDPH 

VVLLTSDNVIRIYSLREPQTPTNVIILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEVVAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPN 

ILVIATESGMLYHCWLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDICDSLQELSTEQKCFVEHILCTKPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKDIAPPPEECLQLLS 

RATQVFREQYILKQDLAKEEIQRRVKLLCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMNRMKKLLHSFHSELPVLSDSERDMKKEL 

QLIPDQLRHLGNAIKQVTMKICDYQQQKMEKVL 

SLPKPTIILSAYQRKCIQSIIJaEEGEHIREMVKQIN 

DIRNHVNF 


3473 


A 


1 


2272 


DKPTRHKT\1,SSSWAKMAAAEGPVGDGELWQT 

WLPNHWFLRLREGLKNQSPTEAEICPASSSLPSS 

PPPQLLTRNWFGLGGELFLWDGEDSSFLVVRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGKGLMVLELPKRWGKNSEFEGGKST 










WCSTTPVAERPFTSSTSLTLKHAAWYPSEILDPH 

VVLLTSDWIRJYSLREPQTPTNVITLSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEWAYPLYILYENGETFLTYISLLHSPGNA 

WKAVGSIAHASUAEDNYGYDACAVLCLPCVPN 

mVIATESGMLYHCVVLEGEEEDDHTSEKSWDSR 

IDLPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGYHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHE.CTKPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECL1WPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKDIAPPPEECLQLLS 

RATQVFREQYILKQDLAKEEIQRRVKLLCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK- 

QEDIMNRMKKLLHSFHSELPVLSDSERDMKKEL 

QL1PDQLRHLGNAIKQVTMKKDYQQQKMEKVL 

SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 

DIRNHVNF


3474 


A 


4344 


2550


DRRREPERHVRVKQRTSVLNMLRRLDKIRFRGH 

KRDDFLDLAESPNASDTECSDEIPLKVPRTSPRDS 

EELRDPAGPGTLIMATGVQDFNRTEFDRLNEIKG 

HLEIALLEKHFLQEELRKLREETNAEMLRQELDR 

ERQRRMELEQKVQEVLKARTEEQMAQQPPKGQ 

AQASNGAERRSQGLSSRLQKWFYERFGEYVEDF 

RFQPEENTVETEEPLSARRLTENMRRLKRGAKPV 

TNFVIO^S ALSD WYS WTSAIAFTVYMN AV\^ 

GWAIPLFLFLAILRLSLNYLIARGWRIQWSIVPEV 

SEPVEPPKEDLTVSEKFQLVLDVAQKAQNLFGK 

MADILEKIKNLFMWVQPEITQKLYVALWAAFLA 

SCFFPYRLVGLAVGLYAGIKFFLIDFIFKRCPRLR 

AKYDTPYIIWRSLPTDPQLKERSSAAVSRRLQTTS 

SRSYVPSAPAGLGKEEDAGRFHSTKKGNFHEIFN 

LTENERPLAVCENGWRCCLINRDRKMPTDYIRN 

GVLYVTVENYLCFESSKSGSSKR^nCVIKLVDITDI. 

QKYKVLSVLPGSGMGIAVSTPSTQKPLVFGAMV 

HRDEAFETILSQYIKITSAAASGGDS 


3475 


A 


2 


1126 


TAARRRQKGAAAAAETHGQAKAKSGWLKPYYF 

mLMESRKDITNQEELWKMI<^RRNLEEDDYLHK 

DTGETSMLKRPVLLHLHQTAHADEFDCPSELQH 

TQELFPQWHLPIKIAAriASLTFLYTLLREVIHPLA 

TSHQQ\TYKIPILVINKVLPMVSITLLALVYLPGV

IAAIVQLHNGTKYKKFPHWLDKWMLTRKQFGL 

LSFFFAVLHAIYSLSYPMRRSYRYKLLNWAYQQ 

VQQNKEDALMEHDVWRMEIYVSLGIVGLAILAL 

LAVTSIPSVSDSLTWREFHYIQSKLGIVSLLLGTIH 

ALIFAWNKWIDIKQFVWYTPPTFMIAVFLPIVVLI 

FKSILFLPCLRKXILKJRHGWEDVTX1NKTEICSQL 


3476 


A


143 


3191 


AKAPPTGESSEPEAKVLHTKRLYRAVVEAVHRL 

DLILCNKTAYQEWI<^ENISLRNKLRELCVKLMF 

LHPVDYGRKAEELLWRKVYYEVIQLIKTNKKH1 

HSRSTLECAYRTHLVAGIGFYQHLLLYIQSHYQL 

ELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQ 

MACHRCLVYLGDLSRYQNELAGVDTELLAERFY 

YQALSVAPQIGMPFNQLGTLAGSKYYNVEAMY 

CYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQL 

KXCETOBXSPGKKRCKDIKJRLLVOT 

PKSSSVDSELTSLCQSVLEDFNLCLFYLPSSPNLS 

LASEDEEEYESGYAFLPDLLIFQMVIICLMCVHSL

ERAGSKQYSAAIAFTLALFSHLVNHVNIRLQAEL 

EEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPP 

PVTPQVGEGRKSRKFSRLSCLRRRRHPPKVGDDS 

DLSEGFESDSSHDSARASEGSDSGSDKSLEGGGT 

AFDAETDSEMNSQESRSDLEDMEEEEGTRSPTLE 

PPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQM 

FQTKRCFRLAPTFSNLLLQPTTNPHTSASHRPCV 

NGDVDKPSEPASEEGSESEGSESSGRSCRNERSIQ 

EKLQVLMAEGLLPAVKVFXDWLRTNPDLnVCA 

QSSQSLWNRLSVLLNLLPAAGELQESGLALCPEV 

QDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAH 

RRFNFDTDRPLLSTLEESVVRICCIRSFGHFIARLQ 

GSELQFNPEVGIFVSIAQSEQESLLQQAQAQFRMA 

QEEARKNRLMRDMAQLRLQLEVSQLEGSLQQPK 

AQSAMSPYLVPDTQALCHHLPVIRQLATSGRFIVI 

IPRTVIDGLDLLKKEHPGARDGIRYLEAEFKKGN 

RYIRGQKEVGKSreRHKLKRQDADAWTLYKILD 

SCKQLTVLAQGAGEEDPSGMVTIITGLPLDNPSVL 

SGPMQAALQAAAHASVDIKNVLDFYKQWKEIG 


3477 


A 


1 


3902 


MTEPRERRGYSVPPRPEVGTQATEWRVEESNFN 

KIFLKKDAELGRSNHLPTWDKPEDASWLPQSCL

GGDAVATTGEIHEEKAWKTRALEVGQPAQRDIR 

RGELWGKEHGADQAIQETLEDLSSLERTLVVSES

SPLGGDCQEVTTLTVKYQVSEEVPSGTVIGKLSQ 

ELGREERRRQAGAAFQVLQLPQALPIQVDSEEGL 

LSTGRRLDREQLCRQWDPCLVSFDVLATGDLALI 

HVEIQVLDINDHQPRFPKGEQELEISESASLRTRJP 

LDRALDPDTGPNTLHTYTLSPSEHFALDVIVGPD 

ETKHAELIVVKELDREIHSFFDLVLTAYDNGNPP 

KSGTSLVKVNVLDSNDNSPAFAESSLALEIQEDA 

APGTLLIKLTATDPDQGPNGEVEFFLSKHMPPEW 

LDTFS1DAKTGQVILRRPLDYEKNPAYEVDVQAR 

DLGPNPffAHCKVLIKX^DVNDNPSIHVTWASQP 

SLVSEALPKDSFIALVMADDLDSGNNGLVHCWL 

SQELGHFRLKRTNGNTYMLLTNATLDREQWPK 

YTLTLLAQDQGLQPLSAKKQLSIQISDINDNAPVF 

EKSRYEVSTRENNLPSLHLITIKAHDADLGINGK 

VSYRIQDSPVAHLVAIDSNTGEVTAQRSLNYEEM 

AGFEFQV1AEDSGQPMLASSVSVWVSLLDANDN 

APEWQPVLSDGKASLSVLVNASTGHLLVPIETP 

NGLGPAGTDTPPLATHSSRPFLLTTIVARDADSG 

ANGEPLYSIRSGNEAHLFELNPHTGQLFVNVTNA 

SSLIGSEWELEIVVEDQGSPPLQTRALLRVMFVTS 

VDHLRDSARKPGALSMSMLTVICLAVLLGIFGLI 

LALFMSICRTEKKDNRAYNCREAESTYRQQPKR

PQKfflQKADIHLVPVLRGQAGEPCEVGQSHKDV 

DKEAMMEAGWDPCLQAPFHLTPTLYRTLRNQG 

NQGAPAESREVLQDTVNLLFNHPRQRNASRENL 

NLPEPQPATGQPRSRPLKVAGSPTGRLAGDQGSE 

EAPQRPPASSATLRRQRHLNGKVSPEKESGPRQI 

LRSLVRLSVAAFAERNPVEELTVDSPPVQQISQLL 

SLLHQGQFQPKPNHRGNKYLAKPGGSRSAIPDTD 

GPSARAGGQTDPEQEEGPLDPEEDLSVKQLLEEE 

LSSLLDPSTGLALDRLSAPDPAWMARLSLPLTTN 

YRDNVISPDAAATEEPRTFQTFGKAEAPELSPTG 

TRLASTFVSEMSSLLEMLLEQRSSMPVEAASEAL 

RRLSVCGRTLSLDLATSAASGMKVQGDPGGKTG 

TEGKSRGSSSSSRCL 


3478 


A 


13 


1620 


TLPPPGNSGCHRLCFPEFEFLQVTKMEFSGRKWR 

KLRLAGDQRNASYPHCLQFYLQPPSENISLIEFEN 

LAIDRVKLLKSVENLGVSYVKGTEQYQSKLESEL 

RKLKFSYRENLEDEYEPRRRDHISHFILRLAYCQS 

EELRRWFIQQEMDLLRFRFSELPKDKIQDFLKDSQ 

LQFEAISDEEKTLREQEIVASSPSLSGLKLGFESrY 

KIPFADALDLFRGRKVYLEDGFAYVPLKDrVAIIL 

NEFRAKLSKALALTARSLPAVQSDERLQPLLNHL 

SHSYTGQDYSTQGNVGKISLDQIDLLSTKSFPPC

MRQLHKALRENHHLRHGGRMQYGLFLKGIGLT 

LEQALQFWKQEFEKGKMDPDKFDKGYSYNIRHS 

FGKEGKRTDYTPFSCLKIDLSNPPSQGDYHGCPFR 

HSDPELLKQKLQSYKISPGGISQILDLVKGTHYQ 

V\ACQKYFEMIHTVDDCGFS\LSHPNQYFCESQRI 

LNGGKDIKKEPIQPETPQPKPSVQKTXDASSALA 

SLNSSLEMDMEGLEDYFSEDS 


3479 


A 


698 


138 


RPELELWRLRSRSWRPLGVPRRCHRRNWKEPVR
AQPLSVTVWAPRCQRP/QPPAPEPSSPNAAVPEAI 
PTPRAAASAALELPLGPAPVSVAPQAEAEARSTP 
GPAGSRLGPETFRQRFRQFRYQDAAGPREAFRQL 
REL/SPRQWLRPDI\RTKEQ\IVEMLVQEQLLAILP
EAARARRJRRRTDVRITG 


3480 


A 


117 


2226 


RRGSRSRGPFAEPAAPGGLCSSSEEKTEEGGMAV
GLCKAMSQGLVTFRDVALDFSQEEWEWLKPSQ 
KDLYRDVMLENYRNLVWLGLSISKPNMISLLEQ 
GKEPWMVERKMSQGHCADWESWWEIEELSPK 
\WIDEDEISQEMVMERLASHGLECSSFREAWKY 
KGEFELHQGNAERHFMQVTAVKEISTGKRDNEF 
SNAWEKHTPEISIFNTTESVPTIQQVHKFDIYDKLF 
PQNSVIIEYKRLHAEKESLIGNECEEFNQSTYLSK 
DIGIPPGEKPYESHDFSKLLSFHSLFTQHQTTHFG 
KLPHGYDECGDAFSCYSFFTQPQRIHSGEKPYAC 
NDCGKAFSHDFFLSEHQRTHIGEKPYECKECNKA 
FRQSAHLAQHQRIHTGEKPFACNECGKAFSRYAF 
LVEHQRIHTGEKPYECKECNKAFRQSAHLNQHQ
RIHTGEKPYECNQCGKAFSRRIALTLHQRIHTGE 
KPFKCSECGKTFGYRSHLNQHQR1HTGEICPYECI 
KCGKFFRTDSQLNRHHRIHTGERPFECSKCGKAF 
SDALVLIHHKRSHAGEKPYECNKCGKAFSCGSY 
LNQHQR1HTGEKPYECSECGKAFHQILSLRLHQRI 
HAGEKPYKCNESQRVRRSELAVSRGLTTKPADT 
GPDSTLNAAKVAEPARAGTEAALRPALSVAESA 
TSLGPLHQGRRFPEAPAAHPGGTGFTVCAS 


3481 


A 


2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEAEG
RLREKLFSGYDSSVRPAREVGDRVRVSVGLILAQ 
LISLNEKDEEMSTKVYLDLEWTDYRLSWDPAEH 
DGIDSLRITAESVWLPDWLLNNNDGNFDVALDI 
SVWSSDGSVRWQPPGIYRSSCSIQVTYFPFDWQ 
NCTMVFSSYSYDSSEVSLQTGLGPDGQGHQEIHI 

HEGTFEENGQWENIHKPSRLIQPPGDPRGGREGQ 

RQEV1FYLIIRRKPLFYLVNV1APCILITLLAIFVFY 

LPPDAGEKMGLSIFALLTLTVFLLLLADKVPETSL 

SVPniKYLMFTMVL\nTSVIL 

THQMPLWVRQEFIHKLPLYLRLKRPKPERDLMPE 

PPHCSSPGSGWGRGTDEYFIRKPPSDFLFPKPNRF 

QPELSAPDLRRFIDGPNRAVALLPELREVVSS1SYI 

ARQLQEQEDHDALKEDWQFVAMVVDRLFLWTF 

HFTSVGTLWIFLDATYHLPPPDPFP 


3482 


A 


1273 


172 


ERWDSGGADAEWYALADWTAVWLPRSDFYTR 

LQTGEGHVPALRLPAGMPPDSPRELVPKQAPCSP 

SDPALPWTLGHGNQPPAWPEPQGPMGPAGVAA

RPGRF7GVYLLYCLNPRYRVR\VYVGFTVNTARR 

VQQHNGGRKKGGA\GRTSGRGPWEMVLVVHGF 

PSSVAALRFEWAWQHPHASRRLAHVGPRLRGET 

AFAFHLRVLAHMLRAPPWARLPLTLRWVRPDLR 

QDLCLPPPPHVLLAFGPPPAQVPRPQRRRAGPFD 

DAEPEPDQGDPGACCSLCAQTIQDEEGPLCCPHP 

GCLLRAHVICLAEEFLQEEPGQLLPLEGQCPCCE 

KSLLWGDLIWLCQMDTEKEVEDSELEEAHWTD 

LLET


3483 


A


230 


3686 


WRPWPCEDTSWNLQVAARTLRVSSAQCGLVPT

MARVESPVPAARASLTGSCVLGQAMPLRGGAGP 

SPASHGPTHGPSDPRTCLPGRGAGGMRPHGRGA 

LGCCGLCSFYTCHGAAGDEIMHQDIVPLCAADIQ 

DQLKKRFAYLSGGRGQDGSPVITFPDYPAFSEIPD 

KEFQNVMTYLTSIPSLQDAGIGFILVIDRRRDKW 

TSVKASVLRIAASFPANLQLVLVLRPTGFFQRTLS 

DIAFKFNRDDFKMKVPVIMLSSVPDLHGYIDKSQ 

LTEDLGGTLDYCHSRWLCQRTAIESFALMVKQT 

AQMLQSFGTELAETELPNDVQST\SSVLCAHTEK 

KDKAKEDLRLALKEGHSVLESLRELQAEGSEPSV

NQDQLDNQATVQRLLAQLNETEAAFDEFWAKH 

QQKLEQCLQLRHFEQGFREVKAILDAASQKIATF 

TDIGNSLAHVEHLLRDLANFQEKSGVFVERARA

LSLTASSFIGNKHYAVDSIRPKCQELRHLCDQFSA 

EIARRRGLLSKSLELHRRLETSMKWCDEGIYLLA 

SQPVDKCQSQDGAEAALQEIEKFLETGAENKIQE 

LNAIYKEYESILNQDLMEHVRXVFQKQASMEEV

FHRRQASLKKLAARQTRPVQPVAPRPEALAKSP 

CPSPGIRRGSENSSSEGGALRRGPYRRAKSEMSES 

RQGRGSAGEEEESLAILRRHVMSELLDTERAYVE 

ELLCNO-EGYAAEMDNPLMAHLLSTGLHNKKDV 

LFGNMEErYHFHNRIFLREIJENYTDCPELVGRCF 

LERMEDFQIYEKYGQNKPRSESLWRQCSDCPFFQ 

ECQRKLDHKLSLDSYLLKPVQRITKYQLLLBCEM

LKYSRNCEGAEDLQEALSSILGILKAVNDSMHLr 

AITGYDGNLGDLGKLLMQGSFSVWTDHKRGHT

KVKELAJRFKPMQRHLFLHEKAVLFCKKREENGE 

GYEKAPSYSYKQSLNMAAVGITENVKGDAKKFE 

IWYNAREEVYIVQAPTPEIICAAWVNEIRKVLTSQ 

LQACREASQHRALEQSQSLPLPAPTSTSPSRGNSR 

NIKKLEERKTDPLSLEGYVSSAPLTKPPEKGKGW 

SKTSHSLEAPEDDGGWSSAEEQINSSDAEEDGGL 

GPKKLVPGKYTWADHEKGGPDALRVRSGDVV 

ELVQEGDEGLW 


3484 


A 


208 


6103 


VTMAQQAADKYLYVDKNFINNPLAQADWAAK 

KLVWVPSDKSGFEPASLKEEVGEEAIVELVENGK 

K\^V3^DIQKMNPPKFSKVED1VL«LTCL>[EAS 

VLHNLKERYYSGLIYTYSGLFCWINPYKNLPIYS 

EEIVEMYKGKKRHEMPPHIYAITDTAYRSMMQD 

REDQSILCTGESGAGKTENTXKVIQYLAYVASSH 

KSKKDQGELERQLLQANPILEAFGNAKTVKNDN 

SSRFGKFIRINFDVNGYIVGAMETYLLEKSRAIRQ 

AKEERTFHIFYYLLSGAGEHLKTDLLLEPYNKYR 

FLSNGHVTIPGQQDKDMFQETMEAMRIMGIPEEE 

QMGLLRVISGVLQLGNIVFKKER>rrDQASMPDN 

TAAQKVSHLLGINVTDFTRGILTPRIKVGRDYVQ 

KAQTKEQADFAIEALAXATYERMFRWLVLRINK 

ALDKTKRQGASFIGDLDIAGFEIFDLNSFEQLCINY 

TNEKLQQLFNHTMFILEQEEYQKEGIEWNFIDFG 

LDLQPCIDLIEKPAGPPGELALLDEECWFPKATDK 

SFVEKVMQEQGTHPKFQKPKQLKDKADFCIIHY 

AGKVDYKADEWLMKNMDPLNDNIATLLHQSSD

KFVSELWKDVDRIIGLDQVAGMSETALPGAFKT 

RKGMFRTVGQLYKEQLAKLMATLRNTNPNFVR 

CIIPNHEKKAGKLDPHLVLDQLRCNGVLEGIRICR 

QGFPNRVVFQEFRQRYEILTPNSIPKGFMDGKQA 

CVLM1KALELDSNLYRIGQSKVFFRAGVLAHLEE 

ERDLKITDVHGFQACCRGYLARKAFAKRQQQLT 

AMKVLQRNCAAYLKLRNWQWWRLFTKVKPLL 

QVSRQEEEMMAKEEELVKVREKQLAAENRLTE 

METLQSQLMAEKLQLQEQLQAETELGAEAEELR 

ARLTAK\KQ\ELEEICHDLEARVEEEEERCQHLQA 

EKKKMQQNIQELEEQLEEEESARQKLQLEKVTT 

EAKLKKLEEEQIELEDQNCKLAKEKKLLEDRIAEF 

TTNLTEEEEKSKSLAKLKMKHEAMITDLEERLRR 

EEKQRQELEKTRRXLEGDSTDLSDQIAELQAQMA 

ELKMQLAKKEEELQAALARVEEEAAQKNMALK 

KIRELESQISELQEDLKCER\ASR>JKAEKQICRDLG 

EELEALKTELEDTLDSTAAQQELRSKREQEVNIL 

KKTLEEEAKTHEAQIQEMRQKHSQAVEELAEQL 

EQTKRVKANLEKAKQTLENERGELANEVKVLLQ 

GKGDSEHKRKKVEAQLQELQVKFNEGERVRTEL 

ADKVTKLQVELDNVTGLLSQSDSKSSKLTKDFS 

ALESQLQDTQELLQEENRQKLSLSTKLKQVEDE

KNSVFREQLEEEEEEAKHNLEKQIATLHAQVADM

KKKMEDSVGCLETAEEVKRKLQKDLEGLSQRHE 

EKVAAYDKLEKTKTRLQQELDDLLVDLDHQRQ 

SACNLEKKQKKFDQLLAEEKTISAKYAEERDRA 

EAEAREKETKALSLARALEEAMEQKAELERLNK 

QFRTEMEDLMSSKDDVGKSVHELEKSKRAIEQQ 

VEEMKTQLEELEDELQATEDAKLRLEVNLQAM 

KAQFERDLQGRDEQSEEKKKQLVRQVREMEAE 

LEDERKQRSMAVAARKKLEMDLKDLEAHIDSA 

NKNRDEAIKQLRKLQAQMKDCMRELDDTRASR 

EEILAQAKENEKKLKSMEAEMIQLQEELAAAER 

AKRQAQQERDELADEIANSSGKGALALEEKRRL 

EARIAQLEEELEEEQGNTELINDRLKKANLQIDQI 

OTDLNLERSHAQKNENARQQLERQMCELKVKL 

QEMEGTVICSKYKASITALEAKIAQLEEQLDNETK 

ERQAACKQVRRTEKKLKDVLLQVDDERRNAEQ 

YKDQADKASTRLKQLKRQLEEAEEEAQRANASR 

RKLQRELEDATETADAMNREVSSLKNKLRRGDL 

PFVVPRRMARKGAGDGSDEEVDGKADGAEAKP 

AE 


3485 


A 


2 


1782 


CSTGVSKAPLTYLMSYGFELGWRKGNRAVACR 

EDRGGESVGMGQESBLSQVHWWEAEPVEKTPGR 

DSEATIMSLRVHTLPTLLGAVVRPGCRELLCLLM 

ITVTVGPGASGVCPTACICATDIVSCTNKNLSKVP 

GNLFRLDCRLDLSYNRIGLLDSEWIPVSFAKLNTL 

ILRHNNITSISTGSFSTTPNLKCLDLSSNKLKTVVK 

NAVFQELKVLEVLLLYNNHISYLDPSAFGGLSQL 

QKLYLSGNFLTQFPMDLYVGRFKLAELMFLDVS 

YNRIPSMPMHHINLVPGKQLRGIYLHGNPFVCD\ 

CSLVSLLVFWYRRHFSSVMDFKNDYTCRLWSDS

RHSRQVLLLQDSFMNCSDSIINGSFRALGFIHEAQ 

VGERLMVHCDSKTGNANTDFIWVGPDNRLLEPD 

KEMENFYVFHNGSLVIESPRFEDAGVYSCIAMNK 

QRLLNETYDVTINVSNFTVSRSHAHEAFNTAFTT 

LAACVASIVLVLLYLYLTPCPCKCKTKRQKNML 

HQSNAHSSILSPGPASDASADERKAGAGKRVVFL 

EPLKDTAAGQNGKVRLFPSEAVIAEGILKSTRGK 

SDSDSVNSVFSDTPFVAST 


3486 


A 


357 


1173 


GDPRETKVFPSRSFAR>3TVGVSHHQSHLFHTVSR 

IYVEDKHKILYCEVPKAGCSNWKRILMVLNGLA 

SSAYNISHNAVHYGKHLKKLDSFDLKGIYTRLDT 

YTICVLVLVRDPMERLVSAFRDKFDHPNSYYHPVF 

GKAJIKKYRPNACEEALINGSGVKFKEFIHYLLDS 

HRPVGMDIHWEKVSKLCYPCLINYDFVGKFETL 

EEDANYFLQMIGAPKELKFPNFKDRHSSDERTOA 

QVVRQYLKDLTRTERQLIYDFYYLDYLMFNYTT 

PFL 


3487 


A 


2 


3281 


CDKSGAVPFSTTRSPRRPSPRSAGPSLSSVSPRSQ 

LWASSGLSEEHAAPLLPAWPRHPCPPSLTPGPSM 

AQGAMRFCSEGDCAISPPRCPRRWLPEGPVPQSP 

PASMYGSTGSLLRRVAGPGPRGRELGRVTAPCTP 

LRGPPSPRVAPSPWAPSSPTGQPPPGAQSSWIFR 

FVEKASVRPLNGLPAPGGLSRSWDLGGVSPPRPT 

PALGPGSNRKLRLEASTSDPLPARGGSALPGSRN 

LVHGPPAPPQVGADGLYSStPNGLGDPPERLATL 

FGGPADTGFLNQGDTWSSPREVSSHAQRIARAK 

WEFFYGSLDPPSSGAKPPEQAPPSPPGVGSRQGS 

GYAVGRAAKYSETDLDTVPLRCYRETDIDEVLA 

EREEADSAIESQPSSEGPPGTAYPPAPRPGPLPGP

HPSLGSGNEDEDDDEAGGEEDVDDEVFEASEGA 

RPGSRMPLKSPVPFLPGTSPSADGPDSFSCVFEAI 

LESHRAKGTSYTSLASLEALASPGPTQSPFFTFEL

PPQPPAPRPDPPAPAPLAPLEPDSGTSSAADGPWT 

QRGEEEEAEARAKLAPGREPPSPCHSEDSLGLGA 

APLGSEPPLSQLVSDSDSELDSTERLALGSTDTLS 

NGQKADLEAAQRLAKRLYRLDGFRKADVARHL 

GKNNDFSKLVAGEYLKFFVFTGMTLDQALRVFL 

KELALMGETQERERVLAHFSQRYFQCNPEALSSE 

DGAHTLTCALMLLNTDLHGHNIGKRMTCGDFIG 

NLEGLlSfDGGDFPRELLKALYSSIKNEKLQWAIDE 

EELRRFLSELADPNPKVIKRISGGSGSGSSPFLDLT 

PEPGAAVYKHGALVRKVHADPDCRKTPRGKRG 

WKSFHGILKGMILYLQKEEYKPGKALSETELKN 

AISIHHALATRAS\hTY P SKRPHVFYLRTADWRVFL 

FQAPSLEQMQSWITRINVVAAMFSAPPFPAAVSS 

QKXFSRPLLPSAAT1U.SQEEQVRTHEAI<XICAMA 

SELREHRAAQLGKKGRGKEAEEQRQKEAYLEFE 

KSRYSTYAALLRVKLKAGSEELDAVEAALAQAG 

STEDGLPPSHSSPSLQPKPSSQPRAQRHSSEPRPG 

AGSGRRKP 


3488 


A 


441 


1968 


GTETPHCWGRGTAGLRRELDREERDGPGTATMS 

FPHFGHPYRGAFQFLVASASSSTTCCESTLRSVSY 

VASGSTPAPALCCAPWDSRLLGSARPELGAALGI 

YGAPYAAAAAAQSYPGYLPYSPEPPSLYGALNP 

QYEFKEAAGSFTSSLAQPGAYYPYERTLGQYQY 

ERYGAVELSGAGRRKNATRETTSTLKAWLNEHR 

KNPYPTKGEKIMLAIITKMTLTQVSTWFANARRR 

LKKENKMTWAPKhTKGGEERKAEGGEEDSLGCL 

TADTKEVTASQEARGLRLSDLEDLEEEEEEEEEA 

EDEEWATAGDRLTEFRICGAQSLPGPCAAAREG

RLERRECGLAAPRFSFNDPSGSEEADFLSAETGSP 

RLTMHYPCLEKPRIWSLAHTATASAVEGAPPARP 

RPRSPECRMIPGQPPASARRLSVPRDSACDESSCI 

PKAFGNPKFALQGLPLNCAPCPRRSEPVVQCQYP 

SGAEGSGPPAALGVSMQKTPTYRPARQLHTLCH 

SSLP 


3489 


A 


718 


2073 


1AAYHKALSYRGHVHANNRGTNNVHFTPPPSPS 

RGILPN^KNMMhfflSQVGQGIGIPSRTNSMSSSG 

LGSPNRSSPSIICMPKQQPSRQPFTVNSMSGFGMN 

RNQAFGMNNSLSSNIFNGTDGSENVTGLDLSDFP 

ALADRNRREGSGNPTPLINPLAGRAPYVGMVTK 

PANEQSQDFSIHNEDFPALPGSSYKDPTSSNDDSK 

SNLNTSGKTTSSTDGPKFPGDKSSTTQNNNQQKK 

GIQVLPDGRVTNIPQGMVTOQFGMIGLLTFIRAA 

ETDPGMVHLALGSDLTTLGLNLNSPENLYPKFAS 

PWASSPCRPQDIDFHVPSEYLTNIHIRDKLFFFFS 

W/TAIKLGRYGEDLLFYLYYMNGGDVLQLLAAV 

ELFNRDWRYHKEERVWITRAPGMEPTMKTNTY 

ERGTYYFFDCLNWRKVAKEFHLEYDKLEERPHL 

PSTFNYNPAQQAF 


3490 


A 


2 


2833 


FVAKMATSQYFDFAQGGGPQYSTQAPTLPLPTV 

GASYTGQPTPGMDPAVNPAFPPAAPAGYGGYQP 

HSGQDFAYGSRPQEPVPTATTMATYQDSYSYGQ 

SAAARSYEDRPYFQSAALQSGRMTAADSGQPGT 

QEACGQPSPHGSHSHAQPPQQAPIVESGQPASTL 

SSGYTYPTATGVQPESSASIVTSYPPPSYNPTCTA 

YTAPSYPNYDASVYSAASPFYPPAQPPPPPGPPQ 

QLPPPPAPAGSGSSPRADSKPPLPSKLPRPKAGPR 

QLQLHYCDICKISCAGPQTYREHLGGQKHRKKE 

AAQKTGVQPNGSPRGVQAQLHCDLCAVSCTGA 

DAYAAHIRGSKHQKVFKLHAKLGBCP1PTLEPALA 

TESPPGAEAKPTSPTGPSVCASSRPALAKRPVASK 

ALCEGPPEPQAAGCRPQWGKPAQPKLEGPGAPT 

QGGSKEAPAGCSDAQPVGPEYVEEVFSDEGRVL 

RFHCKLCECSFNDLNAKDLHVRGRRHRLQYRKK 

VNPDLPIATEPSSRARKVLEERMRKQRHLAEERL

EQLRRWHAERRRLEEEPPQDVPPHAPPDWAQPL 

LMGRPESPASAPLQPGRJRPASSDDRHVMCKriATI 

YPTEQELLAVQRAVSHAERALKLVSDTLAEEDR 

GRREEEGDKRSSVAPQTRVLKGVMRVGILAKGL 

LLRGDRNVRLALLCSEKPTHSLLRRIAQQLPRQL 

QMVTEDEYEVSSDPEANWISSCEEPRMQVTISVT 

SPLMREDPSTDPGVEEPQADAGDVLSPKKCLESL 

AALRHARWFQARASGLQPCVIVIRVLRDLCRRV 

PT\WGALPAWAMELLVEICAVSSAAGPLGPGDAV 

RRVLECVATGTLLTDGPGLQDPCERDQTDALEP 

MTLQEREDVTASAQHALRMLAFRQTHKVLGMD 

LLPPRHRLGARFRKRQRGPGEGEEGAGEKKRGR 

RGGEGLV 


3491 


A 


2


1323 


FVGDGALSGCRRGRAPRVPSMAGSLPPCVVDCG 
TGYTKLGYAGNTEPQFnPSCIAIRESAKVVDQAQ 
RRVLRGVDDLDFFIGDEAIDKPTYATKWPIRHGil
EDWDLMERJFMEQVVFKYLRAEPEDHYFLMTEP 
PLNTPENREYLAEIMFESFNVPGLYIAVQAVLAL
AASWTSRQVGERTLTGIVIDSGDGVTHVIPVAEG 
YVIGSCDCHIPIAGRDrrYFIQQLLREREVGIPPEQS 
LETAKAIKEKYCYICPDIVKEFAKYDVDPRKWIK
QYTGINAINQKKFVIDVGYERFLGPEIFFHPEFAN
PDFMESISDWDEVIQNCPDDVRRPLYKNWLSG 
GSTMFRDFGRRLQRDLKRVVDARLRLSEELSGG\
RIKPKPVEVQWTHHMQRYAV\WFGG\SMLASTP 
EFFQVCHTKKDYEEYGPSICRHNPVFGVMS 


3492 


A 


3


2024 


PNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS 

REESPAPSRAPASASLWRRLVVVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARVVGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQrVENWQGNPIQKESLRVFFLVLQVTOYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HnMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTOTQELWAFIVTOLASVYIREGNRHQEVV\LYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNNMVVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3493


A 


3 


2024 


PNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS 
REESPAPSRAPASASLWRRLVVVEAKMAAHAAA 
AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 
SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 
LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 
AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARWGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQT1STLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NrLM.QLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEWXLYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNNMVWAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHffiACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3494 


A 


2 


1615 


VLRGQRGPAGGLAEERRRGRNEWRIHDVTTAPF 

PGLVQRRSRLLIVSQVRYFLKNKVSPDLCNEDGL 

TALHQCCIDNFEEIVKLLLSHGANVNAKDNELW 

TPLHAAATCGHINLVKILVQYGADLLAVNSDGN 

MPYDLCEDEPTLDVIETCMAYQGITQEKINEMRV 

APEQQMIADIHCMIAAGQDLDWIDAQGATLLHI 

AGANGYLRAAELLLDHGVRVDVKDWDGWEPL

HAAAFWGQMQMAELLVSHGAMLNARTSMDE 

MPIDLCEEEEFKVLLLELK\HKHDVIMKSQLRHK 

SSLSRRTSHRQAS/SVGKVVRRTQPVGTGPNLAYR 

KEYE/GEEAELWQRSA\AEDQRTSTYNGDIRET\R 

TDQENKDPNPRLEKVPVLLSEFPTKIPRGELDMPV 

ENGLRAPVSAYQYALANGDVWKVHEVPDYSM 

AYGNPGVADATPPWSSYKEQSPQTLLELKRQRA 

AAICLLSHPFLSTHLGSSMARTGESSSEGKAPLIG 

GRTSPYSSNGTSVYYTVTSGDPPLLKFKAPIEEM 

EEKVHGCCRIS 


3495 


A 


327 


1078 


APK1ADTTPNGPQGAGAVQFMMTNKLDTAMWL 

SRLFTVYCSALFVLPLLGLHEAASFYQRALLANA 

LTSALRLHQRLPHFQLSRAFLAQALLEDSCHYLL 

YSLIFVNSYPVTMSIFPVLLFSLLHAATYTKKVL\ 

DARG\SNSLPLLR\SVLDKLSANQQNILKF1ACNEI 

FLMPATWMLFSGQGSLLQPFIYYRFLTLRYSSRR 

NPYCRTLFNELRIVVEHIIMKPACPIJ : VRRLCLQS 

IAFISRLAPTVP 


3496 




3 


2867 


SSRTREMEEKEELRRQIRLLQGLIDDYKTLHGNAP 

APGTPAASGWQPPTYHSGRAFSARYPRPSRRGYS 

SHHGPSWRKKYSLVNRPPGPSDPPADHAVRPLH 

GARGGQPPVPQQHVLERQVQLSQGQNVVIKVKP 

PSKSGSASASGAQRGSLEEFEDTPWSDQRPREGE 

GEPPRGQLQPSRPTRARGTCSVEDPLLVCQKEPG 

KPRMVKSVGSVGDSPREPRRTVSESVIAVKASFP 

SSALPPRTGVALGRKLGSHSVASCAPQLLGDRRV 

DAGHTDQPVPSGSVGGPARPASGPRQAREASLV 

VTCRTNKFRIQ>nsTOWVAASSKSPRVARRALSPR 

VAAENVCKASAGMANKVEKPQLIADPEPKPRKP 

ATSSKPGSAPSKYKWKASSPSASSSSSFRWQSEA 

GSKDHASQLSPVLSRSPSGD\RPALAHSGLKPLSG 

ETPLSAYKVKTRTBOIRRRGSTSLPGDKKSGTSPA 

ATAKSHLSLRRRQALRGKSSPVLKKTPNKGLVQ 

VTKHRLCRLPPSRAHLPTKEASSLHAVRTAPTSK

VIKTTIYRIVKKTPASPLSAPPFPLSLPSWRARRLS 

LSRSLVLNRLRPVASGGGKAQPGSPWWRSKG^Tl 

CIGGVLYKVSANKLSKTSGQPSDAGSRPLLRTGR 

LDPAGSCSRSLASRAVQRSLAIIRQARQRREKRK 

EYCMYYNRFGRCNRGERCPYIHDPEKVAVCTRF 

VRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGI 

CSNSNCPYSHVYVSRKAEVCSDFLKGYCPLGAK 

CKKKHTLLCPDFARRGACPRGAQCQLLHRTQKR 

HSRRAATSPAPGPSDATARSRVSASHGPRKPSAS 

QRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSS 

KASSSSSSSSSPPASLDHEXAPSLQEAALAAACSN 

RLCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDSG 

KPLHIKPRL 


3497 


A 


1586 


141 


ATARDLGCARRIDRVVMESTPSRGLNRVHLQCR 

NLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSL

AKNWVMRMLFLEQPLPQAAVALWVKKEFSKA 

QEESTGLLSGLRIWHTQLLPGGLQGLILNPIFRQN 

LRJALLGGGKAWSDDTSQLGPDKHARDVPSLDK 

YAEERWEVVLHFMVGSPSAAVSQDLAQLLSQA 

GLMKSTEPGEPPC1TSAGFQFLLLDTPAQLWYFM 

LQYLQTAQSRGMDLVEILSFLFQLSFSTLGKDYS 

VEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYP 

T/RALAINLSSGVSGAGGWHQPGFIV\VETOYRL 

YA YTESELQ1ALIALFSEMLYPFP\NMVV\ARVTR\ 

ESVQQAIASGITAQQIIHFLRTRAHPVMLKQTPVL 

PPTfTDQIRLWELERDRLRFTEGVLYNQFLSQVDF 

ELL\LAHAPICLGVLVFE/NTPAKRLMVVTPAGHS 

DVKRFWKRQKHSS 


3498 


A 


790 


190


RDLGPAALMTASASSFSSSQGVQQPSIYSFSQITR 

SLFLSNGVAANDKLLLSSNRITAIVNAS VGSGQRI 

LRG\LQYIKVPVTDARDSRLYDFFDPIADLIHTVS 

MRQGRTLLNCMAG\MSRSASLCLAYLMKYHSM 

S\LLDAHTWA/TKSRRPnRPNNGFWEQLINYEFK 

LFNNNTVRMINSPVGNIPDIYEKDLRMMISM 


3499 


A 


31 


1586 


TAGFLLAPLEMQRLLTPVKRILQLTRAVQETSLT

PARLLPVAHQRFSTASAVPLAKTOTWPKDVGIL 

ALEVYFPAQYVDQTDLEKYNNVEAGKYTVGLG 

QTRMGFCSVQEDINSLCLTVVQRLMERIQLPWD 

SVGRLEVGTETIIDKSKAVKTVLMELFQDSGNTD 

IEGIDTTNACYGGTASLFNAANWMESSSWDGRY 

AMVVCGDIAVYPSGNARPTGGAGAVAMLIGPK

APLALERGLRGTHMENVYDFYKPNLASEYPrVD 

GKLSIQCYLRALDRCYTSYRKKIQNQWKQAGSD 

RPFTLDDLQYMEFHTPFCKMVQKSLARLMFNDF 

LSASSDTQTSLYKGLEAFGGLKLEDTYTNKDLD 

KALLKASQDMFDKKTKASLYLSTHNGNMYTSSL 

YGCLASLLSHHSAQELAGSRIGAFSYGSGLAASF 

FSFRVSQDAAPGSPL\DKLVSSTSDLPKRLASRKC 

VSPEEFTEIMNQREQFYHKVNFSPPGDTNSLFPGT 

WYLERVDEQHRRKYARRPV 


3500 


A


185 


2692


MLPTEVPQSHPGPSALLLLQLLLPPTSAFFPNIWS 
LLAAPGSITHQDLTEEAALNVTLQLFLEQPPPGRP 
PLRLEDFLGRTLLADDLFAAYFGPGSSRRFRAAL 
GEVSRANAAQDFLPTSRNDPDLHFDAERLGQGR 

ARLVGALRETVVAARALDHTLARQRLGAALHA 

LQDFYSHSNWVELGEQQPHPHLLWPRQELQNLA 

QVADPTCSDCEELSCPRNWLGFTLLTSGYFGTHP 

PKPPGKCSHGGHFDRSSSQPPRGGINKDSTSPGFS 

PHHMLHLQAAKLALLASIQAFSLLRSRLGDRDFS 

RLLDITPASSLSFVLDTTGSMGEEINAAKIQARHL 

VEQRRGSPMEPVHYVLVPFHDPGFGPVFTTSDPD 

SFWQQLNEIHALGGGDEPEMCLSALQLALLHTPP 

LSDIFVFTDASPKDAFLTNQVESLTQERRCRVTFL 

VTEDTSRVQGRARREILSPLRFEPYKAVALASGG 

EVIFTKDQHIRDVAAIVGESMAALVTLPLDPPVV 

VPGQPLWSVDGLLQKITVRIHGDISSFWIKNPAG 

VSQGQEEGGGPLGHTRRFGQFWMVTMDDPPQT 

GTWEIQVTAEDTPGVRVQAQTSLDFLFHFGIPME 

DGPHPGLYPLTQPVAGLQTQLLVEVTGLGSRAN

PGDPQPHFSHVILRGVPEGAELGQVPLEPVGPPE 

RGLLAASLSPTLLSTPRPFSLELIGQDAAGRRLHR 

AAPQPSTWPVLLELSGPSGFLAPGSKVPLSLRIA 

SFSGPQDLDLRTFVNPSFSLTSNLSRAHLELNESA 

WGRLWLEVPDSAAPDSVVMVTVTAGGREANPV

PPTHAFLRLLVSAPAPQDRH 


3501 


A 


1245 


5815 


RRAHPSHSRLSPYLSVSRDPYFFVTVSRTILTLSA 

PAPPRRTPAPSMGTALLQRGGCFLLCLSLLLLGC 

WAELGSGLEFPGAEGQWTRFPKWNACCESEMSF 

QLKTRSARGLVLYFDDEGFCDFLELILTRGGRLQ 

LSFSIFCAEPATLLADTPVNDGAWHSVRIRRQFR 

NTTLFIDQVEAKWVEVKSKJIRDMTVFSGLFVGG 

LPPELRAAALKLTLASVREREPFKGWIRDVRVNS

SQVLPVDSGEVKLDDEPPNSGGG\SPCEAGEEGE 

GGVCLNGGVCSWDDQAVCDCSRTGFRGKDCS 

QEDNNVEGLAHLMMGDQGKEEY1ATFKGSEYF 

CYDLSQNPIQSSSDEITLSFKTLQRNGLMLHTGKS 

ADYVNLALKNGAVSLVINLGSGAFEALVEPVNG 

KFNDNAWHDVKVTRNLRQHSGIGHAMVTISVD 

GILTTTGYTQEDYTMLGSDDFFYVGGSPSTADLP 

GSPVSNNFMGCLKEVVYKNNDVRLELSRLAKQ 

GDPKMKIHGVVAFKCENVATLDPITFETPESFISL 

PKWNAKKTGSISFDFRTTEPNGL3LFSHGKPRHQ 

KDAKHPQMIKVDFFAIEMLDGHLYLLLDMGSGT 

1KIKALLKKVNDGE\WHVDFQRDGRSGTISVNT 

LRTPYTAPGESEILDLDDELYLGGLPENKAGLVF 

PTEVWTALLNYGYVGCIRDLFIDGQSKDIRQMA 

EVQSTAGVKPSCSKETAKPCLSNPCICNNGMCRD 

GWNRYVCDCSGTGYLGRSCEREATVLSYDGSM 

FMKIQLPWMHTEAEDVSLRFRSQRAYGILMAT 

TSRDSADTLRLELDAGRVKLTVNLDCIRINCNSS 

KGPETLFAGYNLNDNEWHTVRVVRRGKSLKLT 

VDDQQAMTGQMAGDHTRLEFHNlETGnTERRY 

LSSVPSNFIGHLQSLTFNGMAYIDLCKNGDIDYC 

ELNARFGFRNIIADPVTFKTKSSYVALATLQAYT 

SMHLFFQFKTTSLDGLILYNSGDGNDFIVVELVK 

GYLHYVFDLGNGANLIKGSSNKPLNDNQWHNV 

M1SRDTSNLHTVKIDTKITTQITAGARNLDLKSDL 

YIGGVAKETYKSLPKLVHAKEGFQGCLASVDLN 

G\RLP\DLISDGSFSCNGTDSRRGMWKGPSTT\CQ 

EDSCSNQGVCLQQWDGFSCDCSMTSFSGPLCND 

PGTTYIFSKGGGQITYKWPPNDRPSTRADRLAIGF 

STVQKEAVLVRVDSSSGLGDYLELHIHQGKIGVK 

FNVGTDDIAIEESNAIINDGKYHWRFTRSGGNA 

TLQVDSWPVIERYPAGRQLTIFNSQATIIIGGKEQ 

GQPFQGQLSGLYYNGLKVLNMAAENDANIATVG 

NVRLVGEVPSSMTTESTATAMQSEMSTSIMETTT 

TLATSTARRGKPPTKEPISQTTDDILVASAECPSD 

DEDIDPCEPSSGGLANPTRAGGREPYPGSAEVIRE 

SSSTTG1^VGIVAAAALCILILLYAMYKYRNRDE

GSYHVDESRNYISNSAQSNGAVVKEKQPSSAKSS 

NKNKKNKDKEYYV 


3502 


A 


394 


72


KPAHLPFTVUMPKRKPSEGAMSDKVKA/KFELQ 
RRSAGLFSKPTPPKPETRPKKDPANQRQKLPKVR 
KGKADA/SKEGNSPAEERCSMVQTQKVEGWRSG 
SELPVALSF 


3503 


A 


43 


3358 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLS 
SLPPPPSRALAPTRAPDTALTIMEVAEVESPLNPS 
CKIMTFRPSMEEFREFNKYLAYMESKGAHRAGL
AKVIPPK1EWKPRQCYDDIDNLLIPAPIQQMVTGQ
SGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRY 
LDYEDLERKYWKNLTFVAPIYGADINGSIYDEGV 
DEWNIARLNTVLDVVEEECGISIEGVNTPYLYFG 
MWKTTFAWHTEDMDLYSINYLHFGEPKSWYAIP 
PEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLIS 
PSVLKKYGIPFDKITQEAGEFMITFPYGYHAGFN 
HGFNCAESTNFATVRWIDYGKVAKLCTCRKDM 
VKISMDIFVRKFQPDRYQLWKQGKDIYTIDHTKP 
TPASTPEVKAWLQRRRKVRKASRSFQCARSTSK
RPKADEEEEVSDEVDGAEVPNPDSVTDDLKVSE 
KSEAAVKLRNTEASSEEESSASRMQVEQNLSDHI 
KLSGNSCLSTSVTEDIKTEDDKAYAYRSVPSISSE 
ADDSIPLSTGYEKPEKSDPSELSWPKSPESCSSVA 
ESNGVLTEGEESDVESHGNGLEPGEIPAVPSGER 
NSFKVPSIAEGENKTSKSWRHPLSRPPARSPMTL 
VKQQAPSDEELPEVLSIEEEVEETESWAKPLIHL 
WQTKPPNFAAEQEYNATVARMKPHCAICTLLMP 
YHKPDSSNEENDARWETKLDEVVTSEGKTKPLIP 
EMCFIYSEENIEYSPPNAFLEEDGTSLLISCAKCC 
VRVHASCYGPSHEICDGWLCARCKRNAWTAEC 
CLC>n.RGGALKQTKNNKWAHVMCAVAVPEVR 
FTOVPERTQIDVGRIPLQRLKLKCIFCRHRVKRVS 
GACIQCSYGRCPASFHVTCAHAAGVL\MEPDDW 
PYVVNITCFRHKVNPNVKSKACEKVISVGQTVIT 
KHRNTRYYSCRVMAVTSQTFYEVMFDDGSFSRD 
TFPEDIVSRDCLKLGPPAEGEVVQVKWPDGKLY 
GAKYFGSNIAHMYQVEFEDGSQIAMKREDirTL 
DEELPKRVKARFVSAGRCHLGTCQVNSLSSPHVS 
QAQQETYLGFWINSKKSQCNIFLSGTY 


3504 


A


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSRLLRAVHR 

SRAWTCYLAIRMLMATCCPSPTTTACTGPWQRA 

PPLRLLVQKREADSSGLAFASNSLQRRKKGLLLR 

PVAPLRTRPPLLISLPQDFRQVSSVIDVDLLPETH 

RRVRLHKHGSDRPLGFYIRDGMSVRVAPQG\LER

VPGEFISRLVRGGLAESTGLLAVSDEILEVNGIEV 

AGKTLNQVTDMMVANSHN\LIVTVKPANQRNN 
VVRGASGRLTGPPSAGPGPAEPDSDDDSSDLVIE 
NRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPS 
LDDQEQASSGWGSRIRGDGSGFSL 


3505 


A 


3 


2898 


SCRSATSQSGCGGGRSWLCSSLKMAAQPPRGIRL 

SALCPKFLHTNSTSHTWPFSAVAELIDNAYDPDV 

NAKQIMDKTVINDHICLTFTDNGNGMTSDKLH 

KMLSFGFSDKVTMNGHVPVGLYGNGFKSGSM\R 

LGKDAIVFTKNGESMSVGLLSQTYL\EV1KAEHV 

VWIVArmHRQMINLAESKASLAAILEHSLFSTE 

QKLLAELDAIIGKKGTRIUWNLRSYKNATEFDFE 

KDKYDIRIPEDLDEITGKKGYI<XQERMDQ1APES 

DYSLRAYCSILYLKPRMQIILRGQKVKTQLVSKS 

LAYIERDVYRPKFLSKTVRITFGFNCRNKDHYGI 

MMYHRNRLIKAYEKVGCQLRANNMGVGVVGn 

ECNFLKPTHNKQDFDYTNEYRLTTTALGEKLND 

YWNEMKVKKKIEYPLNLPVEDIQKRPD 

CDACLKWRKLPDGMDQLPEKWYCSNNPMDPQFR 

NCEVPEEPEDEDLVHPTYEKTYKKTNKEIGFRIRQ 

PEMIPRINAELLFRPT\ALSTPS\FSSPKESVSKR/RH

LSEGTNSYATRLLNNHQVPPQSEPESNSLKRRLS

TRSSILNAKNRRL\SSQF\ENSVYKG\DDDDEDVII

LEENSTPKPAVDHDIDMKSEQSHVEQGGVQVEF 

VGDSEPCGQTGSTSTSSSRCDQGNTAATQTEVPS 

LVVKKEETVEDEIDVRNDAmPSCVEAEAKIHE 

TQETTDKSADDAGCQLQELRNQLLLVTEEKENY 

KRQCHMFTOQIKVLQQRILEMNDKYVKKETCH 

QSTETDAVFLLESINGKSESPDHMVSQYQQALEE 

IERLKKQCSALQHVKAECSQCSNNESKSEMDEM

AVQLDDVFRQLDKCSBERDQYKSEVELLEMEKS 

QIRSQCEELKTEVEQLKSTNQQTATDVSTSSNIEE 

SVNHMDGESLKLRSLRVNVGQLLAMIVPDLDLQ 

QVNYDVDYVDEILGQWEQMSEISST 


3506 


A 


2 


2120 


RPPEAGGRYRAGGRRQAAKPSRPPLPSRRRLPQG 

GRTRRAMDRPAAAAAAGCEGGGGPNPGPAGGR 

RPPRAAGGATAGSRQPSVETLDSPTGSHVEWCK

QLIAATISSQISGSVTSENVSRDYKALRDGNKLA 

QMEEAPLFPGESIKAIVKDVI^CPFMGAVSGTL 

TVTDFKLYFKNVERDPHFILDVPLGVISRVEKIGA 

QSHGDNSCGIEIVCKDMRNLRLAYK\QEEQSKLG 

IFENLNKHAFPLSNGQALFAFSYKEKFPINGWKV 

YDPVSEYKRQGLPNESWKISKINSNYEFCDTYPA 

nWPTSVKDDDLSKVAVFLAKGRVPVLSWIHPE 

SQAHTRCSQPLVGPNDKRCKEDEKYLQTIMDAN 

AQSHKLIIFDARQNSVADTNKTKGGGYESESAYP

NAELVFLEIH>nHVMRESLRKLKErVYPSIDEARW' 

LSNVDGTHWLEYIRMLLAGAVRIADKIESGKTSV 

WHCSDGWDRTAQLTSLAMLMLDSYYRTIKGFE 

TLVEKEWISFGHRFALRVGHGNDNHADADRSPIF 

LQFVDCVWQMTRQFPSAFEFNELFLITELDHLYS 

CLFGTFLCNCEQQRFKEDVYTKTISLWSYINSQL 

DEFSNPFFVNYENHVLYPVASLSHLELWVNYYV 

RWNPRMRPQMPIHQNLKELLAVTIAELQKRVEG 

LQREVATRAVSSSSERGSSPSHFATSVHTLV 


3507 


A 


1 


2169 


GSSIKJRLTN^CAKNIAKKDFFRO

GSGQCHSTDTVKNTLDPKWNQHYDLYVGKTDSI 

TISVWNHKKIHKKQGAGFLGCVRLLSNATSRLKD 

TGYQRLDLCKLNPSDTDAVRGQIWSLQTRDRIG 

TGGSVVDCRGLLENEGTVYEDSGPGRPLSCFME 

EPAPYTDSTGAAAGGGNCRFVESPSQDQRLQAQ 

RLR3STPDVRGSLQTPQNRPHGHQSPELPEGYEQRT 

TVQGQVYFLHTQTGVSTWHDPRIPRDLNSVNCD 

ELGPLPPGWEVRSTVSGRIYFVDHNNRTTQFTDP 

RLHHIMNHQCQLKEPSQPLPLPSEGSLEDEELPA

QRYERDLVQKLKVLRHELSLQQPQAGHCRIEVS 

REEIFEESYRQIMKMRPKDLKKRLMVKFRGEEG 

LDYGGVAREWLYLLCHEMLNPYYGLFQYSTONI 

YMLQDSfPDSSINPDHLSYFHFVGRIMGLAVFHGH 

YINGGFTVPFYKQLLGKPIQLSDLESVDPELHKSL 

VWILEND1TPVLDHTFCVEHNAFGRILQHELKPN 

G\RNWVTEENKKEYVRLYVNWRFMRGIEAQFL

ALQKGFNELIPQHLLKPFDQKELELIIGGLDKIDL 

NDWKSNTRLKHCVADSNIVRWFWQAVETFDEE 

RRARLLQFVTGSTRVPLQGFKALQGSTGVAAGPR

LFTIHLIDANTDNLRKAHTCFNRJDIPPYESYEKL

YEKLLTAVEETCGFAVE 


3508 


A 


3 


6388 


ILYINPADLGWNPPVSSWIEKREIQTERANLTILF 
DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 
CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 
QDQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 
YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 
SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 
VGAKLASLDPEAYLVKNVPFNYYTTSAMLQAVL
EKPLEKKAGRNYGPPGNKKLIYFmDMNMPEVD 
AYGTVQPHTIIRQHLDYGHWYDRSKLSLKEITNV 
QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 
ALSSIYSIILTQHLKLGNFPASLQKSIPPLIDLALAF 
HQKIATTFLPTGIKFHYIFNLRDFANIFQGILFSSV 
ECVKSTWDLIRLYLHESimVYRDKMVEEKDFDL 
FDKIQTEVLKKTFDDIEDPVEQTQSPNLYCHFAN 
GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 
MDLVLFEDAMRHVCHENRILESPRGNALLVGVG
GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 
MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 
VLINDLLASGEIPDLYSDDEVENIISNVRNEVKSQ
GLVDNRENCWKFFIDRIRRQLKWLCFSPVGNKL 
RVRSRKFPAIVNCTAIHWFHEWPQQALESVSLRF 
LQNTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 
NEQRYNYTTPKSFLEFIRLYQSLIJHRHRKELKCK
TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 
QKNEDADKLIQVVGVETDKVSREKAMADEEEQ 
KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 
ALOTLMCTNLTELKSFGSPPLAVSNVSAAVMVL 
MAPRGRVPKDRSWKAAKVTMAKVDGFLDSLIN 
raKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 
AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 
DLTAAQEKLAAIKAKIAHLNENLAKLTARFEKA 
TADKLKCQQEAEVTAVTISLANRLVGGLASENV 
RWADAVQNFKQQERTLCGDILLITAFISYLGFFT
KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM

LMDDADVAAWQNEGLPADRMSVENATILINCE 

RWPLMVDPQLQGIKWIKNKYGEDLRVTQIGQKG

YLQIIEQALEAGAWLIENLEESIDPVLGPLLGRE 

\HKKGRFIKIGDKECEYNPKFRLILHTKLAWHYQ 

PELQAQATLINFTVTRDGLEDQLLAAVVSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 

EVKINEAREHYRPAAARASLLYFIMKDLSKIHPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWK>JKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYVVGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEWAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKS3LFALCYFHAVVAERRKFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE

ERTPYTVVAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGF1WQSFLTAJMQSTAJRKNEWPLDQN1ALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GnTEAKLKDLTPPMPVMFIKAEPAD\RQDCGHVY 

SCPVTKTSQVRDPTYVWTFNLKTKENPSKWVLA 

GVALLLQI 


3509 


A 


3 


6388 


ILYINPADLGWNPPVSSWIEKREIQTERAKLTILF 

DKYLPTCLDTLRTRFKKIPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 

SEmVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVKNVPFNYYTTSAMLQAVL 

EKPLEKKAGRNYGPPGNKKLIYFIDDMNMPEVD 

AYGTVQPHTnRQHLDYGHWYDRSKLSLKEITNV 

QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 

ALSSIYSHLTQHLKLGNFPASLQKSIPPLIDLALAF 

HQKIATTFLPTGIKFHYIFNLRDFAMFQGILFSSV 

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDDIEDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCHINRILESPRGNALLVGVG

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENIISNVRNEVKSQ 

GLVDNRENCWKFFIDRJRRQLKVTLCFSPVGNKL 

RVRSRKFPAIVNCTAIHWFHEWPQQALESVSLRF 

LQNTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKOTDADKLIQVVGVETDKVSREKAMADEEEQ

KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRWKDRSWKAAKVTMAKVDGFLDSLTN 

F>OCEMHENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 

DLTAAQEK1AAIKAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDELLITAFISYLGFFT 

KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 

LMDDADVAAWQNEGLPADRMSVENAT1LINCE 

RWPLMVDPQLQG1KWIKNKYGEDLRVTQIGQKG 

YLQIIEQALEAGAWLIENLEESIDPVLGPLLGRE 

VIKKGRFIKIGDKECEYNPKFRLILHTKLANPHYQ 

PELQAQATLINFTVTRDGLEDQLLAAVVSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITICQTAAEVEKKVQEAKVT 

EVKINEARErT/RPAAARASLLYFIMNDLSKIHPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYWGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEWAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALGYFHAVVAERRKFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEICLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE

ERTPYIW AFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTECKNREEFRSPPREGAYfflGLFMEGACWDTQA 

GUTEAKLKDLTPPMPVMFIKAIPAD\RQDCGHVY 

SCPVTKTSQ\RDPTYVWTFmKTKENPSKWVLA 

GVALLLQI 


3510 


A 


390 


3330 


AAGSGSRPPAPAARKMADLAECNIKVMCRFRPL 

NESEVNRGDKYIAKFQGEDTVVIASKPYAFDRVF 

QSSTSQEQVYNDCAKKIVKDVLEGYNGTIFAYG 

QTSSGKTHTN4EGKXHDPEGMGIIPRIVQDIFNYIY 

SMDENLEFHnCVSYFEIYLDKIRDLLDVSKTOLSV 

HEDKNRVPYVKGCTERFVCSPDEVMDTIDEGKS

NRHVAVTNMNEHSSRSHSIFLINVKQENTQTEQK 

LSGKLYLVDLAGSEKVSKTGAEGAVLDEAKMN 

KSLSALGNVISALAEGSTYVPYRDSKMTRJLQDS 

LGGNCRTTIVICCSPSSYNESETKSTLLFGQRAKTI 

KNWCVNVELTAEQ\\lCKKYEKJEKEKNKILRN'ri 

QWLENELNRWRNGETVPIDEQFDKEKANLEAFT 

VDKDITLTNDKPATAIGVIGNFTDAERRKCEEEIA 

KLYKQLDDKDEEINQQSQLVEKLKTQMLDQEEL 

LASTRRDQDNMQAELNRLQAENDASKEEVKEV 

LQALEELAVNYDQKSQEVEDKTKEYELLSDELN

QKSATLASIDAELQKLKEMThfflQKKRAAEMMA 
SLLKDLAEIGIAVGNNDVXQPEGTGMIDEEFTVA 
RLY1SKMKSEVKTMVKRCKQLESTQTESNKKME 
ENEKELAACQLRJSQHEAKIKSLTEYLQNVEQKK 
RQLEESVDALSEELVQLRAQEKVHEMEKEHLNK 
VQTANEVKQAVEQQIQSHRETHQKQISSLRDEVE 
AKAKLITDLQDQNQKMMLEQERLRVEHEKLKA 
TDQEKSRKLHELTVMQDRREQARQDLKGLEETV 
AKELQTLHNLRKLFVQDLATRVKKSAEIDSVDDT 
GGSAAQKQKISFLENNLE\QLTKSAQTSWYRDNA
DLRCELPKLEKRLRATAERVKALESALKEAKEN 
ASRDRKRYQQEVDRIKEAVRSKNMARRGHSAQI 
AKPBRPGQHPAASPTHPSAIRGGGAFVQNSQPVA 
VRGGGGKQV 


3511 


A 


1 


1757 


MASVQASRRQWCYLCDLPKMPWAMVWDFSEA 

VCRGCVNFEGADRIELLIDAARQLKRSHVLPEGR 

SPGPPALKHPATKDLAAAAAQGPQLPPPQAQPQP 

SGTGGGVSGQDRYDRATSSGRLPLPSPALEYTLG 

SRLANGLGREEAVAEGARRALLGSMPGLMPPGL 

LAAAVSGLGSRGLTLAPGLSPARPLFGSDFEKEK 

QQRNADCLAELNEAMRGRAEEWHGRPKAVREQ 

LLALSACAPFNVRFKICDHGLVGRVFAFDATARP 

PGYEFELKLFTEYPCGSGNVYAGVLAVARQMFH 

DALREPGKALASSGFKYLEYERRHGSGEWRQLG 

ELLTDGVRSFREPAPAEALPQQYPEPAPAALCGP 

PPRAPSRNLAPTPRRRKASPEPEGEAAGKMTTEE 

QQQRHWVAPGGPYSAETPGVPSPIAALKNVAEA

LGHSPKDPGGGGGPVRAGGASPAASSTAQPPTQ 

HRLVAKNGEAEVSPTAGAEAVSGGGSGTGATPG 

APLC\CTLCRERLEDTHFVQ\CPPVPEHKFCFPCSR 

KFIKAQGPAGEWYCPSGDKCPLVGSSVPWAFMQ 

GEIATILAGDIKVKKERDP 


3512


A


3 


1994 


NTNSSSVTNSAAGVEDLNWQVTVPDNEKERLSS 

JEKIKQLREQVNDLFSRKFGEAIGVDFPVKVPYR 

KITFNPGCVVIDGMPPGVYFKAPGYLEISSMRRIL

EAAEFIKFTVIRPLPGLELSNGEYSTVGKRKIDQE 

GRVFQEKWERAYFFVEVQNISTCLICKRSMSVSK 

EYNLRRHYQTTOTSKHYDQYMERMRDEKLHELK 

KGLRKYLLGLSDTECPEQKQVFANPSPTQKSPVQ 

PVEDLAGNLWEKLREKJRSFVAYSIAIDEITDINN 

TTQLAIFIRGVDENFDVSEELLDTVPMTGTKSGN 

EIFSRVEKSLKNFCINWSrCLVSVASTGTPPMVDA 

NNGLVTKLKSRVATFCKGAELKSICCIIHPESLCA 

Q\KLKMDHVMDWVKSVNWICSRGLNHSEFTTL 

LYELDSQYGSLLYYTEIKWLSRGLVLKRFFESLE 

EIDSFMSSRGKPLPQLSSIDWIRDLAFLVDMTMH 

LNALNISLQGHSQTVTQMYDLIRAFLAKLCLWET 

HLTRNNLAHFPTLKLVSRNESDGLNYIPKJAELK 

TEFQKRLSDFKLYESELTLFSSPFSTKIDSVHEELQ 

MEVTOLQC^TVLKTKYDKVGIPEFYKYLWGSYP 

KYKHHCAKILSMFGSTYICEQLFSIMKLSKTKYC 

SQLKDSQWDSVLHIAT 


3513 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 
DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS
GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA

LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

IAEGTSISEMWQNDLQPLLIERYPGSPGSYAARQ 

HIMQRIQRLQADWVLEEDTFLSQTPYGYRSFSNII 

STLNPTAKRHLVLACHYDSKYFSHWWNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTP

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 

HL


3514 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 

DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 

GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA 

LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

IAEGTSISEMWQNDLQPLLIERYPGSPGSYAARQ 

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNn 

STLNPTAKRHLVLACHYDSKYFSHWWNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAPLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 

HL


3515 


A 


114


754 


LCRDLTTTMSSKRTKTKTKKRPQRATSNVFAMF 

DQSQIQEFKEAFNMIDQNRDGFIDKEDLHDMLAS

LGKNPTDEYLDAMMNEAPGPINFTMFLTMFGEK 

LNGTDPEDVIRNAFACFDEEATGTIQEDYLRELL 

TT\MGDRF\TDE\EVDELYREAP1\DKKGGIFNY1\E 

FTRHLETGGPKDKDDRKJTFQIPSPNVPWLATFG 

VFLEIFLLHGP


3516 


A 


1 


5169 


MAAAPSALLLLPPFPVLSTYRLQSRSRPSAPETDD 

SRVGGIMRGEKOTYFRGAAGDHGSCfTTTSPLA 

SALLMPSEAVSSSWSESGGGLSGGDEEDTRLLQL 

LRTARDPSEAFQALQAALPRRGGRLGFPRRKEAL 

YRALGRVLVEGGSDEKRLCLQLLSDVLRGQGEA 

GQLEEAFSLALLPQLVVSLREENPALRKDALQIL 

HICLKRSPGEVLRTLIQQGLESTDARLRASTALLL 

PILLTTEDLLLGLDLTEVHSLARKLGDQETEEESE 

TAFSALQQIGERLGQDRFQSYISRLPSALRRHYN 

RRLESQFGSQVPYYLELEASGFPEDPLPCAVTLS 

NSNLKFGIIPQELHSRLLDQEDYKNRTQAVEELK 

QVLGKFNPSSTPHSSLVGFISLLYNLLDDSNFKW 

HGTLEVLHLLVIRLGEQVQQFLGPVIAASVKVLA 

DNKLVIKQEYMKIFLKLMKEVGPQQVLCLLLEH 

LKHKHSRVREEVVNICICSLLTYPSEDFDLPKLSF 

DLAPALVDSKRRVRQAALEAFAVLASSMGSGKT 

SILFKAVDTVELQDNGDGVMNAVQARLARKTLP 

RLTEQGFVEYAVLMPSSAGGRSNHLAHGADTD 

WLLAGNRTQSAHCHCGDHVRDSMHIYGSYSPTI 

CTRRVLSAGKGICNKLPWENEQPGIMGENQTSTS 

KDIEQFSTYDFIPSAKLKLSQGMPVNDDLCFSRK 

RVSRNLFQNSRDFNPDCLPLCAAGTTGTHQTOLS 

GKCAQLGFSQICGKTGSVGSDLQFLGTTSSHQEK

VYASLNFGSKTQQTFGSQTECTSSNGQNPSPGAY 
ILPSYPVSSPRTSPKHTSPLIISPKKSQDNSVNFSNS 
WPLKSFEGLSKPKSHRRSLSAQKSSVDPTGRVNHG 
VENSQEKPPWQLTPALWRSPSSRRGLNGTXPVPPI 
P\RGISLLPDKADLSTVGHKKKEPDDTWKCEKDS 
LPIDLSELNFKDKDLDQEEMHSSLRSLRNSAAKK 
RAKLSGSTSDLESPDSAMKLDLTMDSPSLSSSPNI 
NSYSESGWSQESLTSSLSTTPQGKRIMSDIFPTFG 
SKPCPTRLSSAKKKJSHIAEQSPSAGSSSNPQQISS 
FDFTTTK1ALSEDSVVVVGKGVFGSLSSAPATCSQ 
SVISSVENGDTFSIKQSIEPPSGIYGRSVQQNISSYL 
DVENEKDAKVSISKSTYNKMRQKRKEEKELFHN 
KX)CEKKEKNSWERMRHTGTEKMASESETTTGAI 
SQYKERMPSVTHSPEIMDLSELRPFSKPE1ALTEA 
LRLLADEDWEKKIEGLNFIRCLAAFHSEILNTKL 
HETOFAVVQEVKNLRSGVSRAAVVCLSDLFTYL 
KKSMDQELDTTVKVLLHKAGESNTFIREDVDKA 
LRAMVNNVTPARAWSLINGGQRYYGRJCMLFF 
MMCHPNFEKMLEKWPSKDLPYIKDSVRNLQQK 
GLGEIPLDTPSAKGRRSHTGSVGNTRSSSVSRDA 
FNSAERAVTEVREVTRKSVPRNSLESAEYLKLIT 
GLLNAKDFRDRINGIKQLLSDTENNQDLVVGNIV 
KIFDAFKSRLHDSNSKVNLVALETMHKMIPLLRD 
HLSPIINMLIPMVDNNLNSKlsfPGIYAAATNVVQA 
LSQHVDNYLLLQPFCTKAQFLNGKAKQDMTEKL 
ADIVTELYQRKPHATEQKVLWLWHLLGNMTN
SGSLPGAGGNIRTATAKLSKALFAQMGQNLLNQ 
AASQPPH1KKSLEELLDMTILNEL 


3517 


A 


1449 


252 


QDLKPVLDREYLAIYLKMVFFTCNACGESVKKI 

QVEKHVSVCRNCECLSCIDCGKDFWGDDYKNH 

VKCISEDQKYGGKGY/EKVKTHKGD/ASKQQAW 

IQKISELIK\RPNVSPKVRELLEQISAFDNVPQ\KK

AKFQNWMKNSLKVHNESILDQVWNIFSEASNSE 

PVNKEQDQRPLHPVANPHAEISTKVPASKVKDA 

VEQQGEVKJCNKilERXEERQKKRKREKKELKLE

^QENSRNQKPKKRKKGQEADLEAGGEEVPEA 

NGSAGKRSIOCKXQRKDSASEEEARVGAGKRKR 

RHSKVETDSKKKKMKLPEHPEGGEPEDDEAPAK 

GKflWKGTIKAILKQAPDNEITIKKLRKKVLAQY 

YTVTDEHHRSEEELLVIrl^KKISPCNPTFKLLKDK 

VKLVK 


3518 


A 


3 


635 


APDSNARNDHFDACSLRVQAGLSSAGPALGNSG 

LAALMASPSKAVIVPGNGGGDVTTHGWYGWVK 

KELEKPGFQCLAKNMPDPITARESIWLPFMETEL^ 

HCDEKTinGHSSGAIAAMRYAETHRVYAIVLVSA 

YTSDLGDENERASGYFTRPWQWEKJKANCPYTV 

QFGSTDDPFLPWKEQQEVAD\SWKPNCTNSLTV 

ATFRTQSFMN 


3519 


A 


81 


2277 


VRETRREMAMAMSDSGASRLRRQLESGGFEARL 

YVKQLSQQSDGDRDLQEHRQRIQALAEETAQNL 

KRNVYQNYRQFIETAREISYLESEMYQLSHLLTE 

QKSSLESIPLTLLPAAAAAGAAAASGGEEGVGGA 

GGRDHLRGQAGFFSTPGGASRDGSGPGEEGKQR 

TLTTLLEKVEGCRHLLETPGQYLVYNGDLVEYD 

ADHMAQLQRVHGFLMNDCLLVATWLPQRRGM

YRYNALYSLDGLAVVNVKDNPPMKDMFKLLMF 

PENRIFQAENAKIKREWLEVLEDTKRALSEKEIRR 

EQEEAAAPRGPPQVTSKATNPFEDDEEEEPAVPE 

VEEEKVDLSMEWIQELPEDLDVC1AQRDFEGAV 

DLLDKLNHYLEDKPSPPPVKELRAKVEERVRQL 

TEVLVFELSPDRSLRGGPKATRRAVSQLIRLGQC 

TKACELFLRNRAAAVHTAIRQLRIEGATLLYIHK 

LCHVFFTSLLETAREFEIDFAGTDSGCYSAFVVW 

ARSAMGMFVDAFSKQVFDSKESLSTAAECVKVA 

KEHCQQLGDIGLDLTFnHALLVKDIQGALHSYK 

EHIEATKHRNSEEMWRRMNLMTPEALGKLKEE 

MKSCGVSNFEQYTGDDCWVNLSYTVVAFTKQT 

MGFLEEALKLYFPELHMVLLESLVEIILVAVQHV 

DYSLRCEQDPEKKAFIRQNASFLYETVL\PVVEK 

RFEEGVGKPAKQLQDLRNASRLIRVNPESTTSVV 


3520 


A


1706 


540 


FVAHLAWPWRADGDMEDGVLNEGFLVKRGHIV 

HNWKARWFILRQNTLVYYKLEGGRRVTPPKGRI 

LLDGCTITCPCLEYENRPLLIKLKTQTSTEYFLEA 

CSREE/RRDAWAFEMTGAIHAGQARGKVQQLHS 

LRNSFKLPPfflSLHRIVDKMHDSNTGIRSSPNMEQ 

GSTYKKTFLGSSLVDWLISNSFTASRLEAVTLAS

MLMEENFLRPVGVRSMGAIRSGDLAEQFLDDST 

ALYTFAESYKKK1SPKEEISLSTVELSGTVVKQGY 

LAKQGmOUO^CVRRFVLRKDPAFLHYYDPSK 

EENRPVGGFSLRGSLVSALEDNGVPTGVKGNVQ 

GNLFKVITK\DDTHYYIQA\SSKAE\RAE\WIGSLS 

KSLNMNKDPEGTPDSLPSLPR 


3521 


A 


3 


3063 


HASVSLSLGCPRPCADTPGPQPQPMDLRVGQRPP 

VEPPPEPTLLALQRPQRLHHHLFLAGLQQQRSVE 

PMRVKMELPACGATLSLVPSLPAFSIPRHQSQSST 

PCPFLGCRPCPQLSMDTPMPELQEAPQEQELRQL 

LHKDKSKRSAVASSVVKQKLAEVILKKQQAALE 

RTVHPNSPGIPYRTLEPLETEGATRSMLSSFLPPV 

PSLPSDPPEHFPLRKTVSEPNLKLR\TCPKKSLERR 

KNPLLRKESAPPSLRRRPAETLGDSSPSSSSTPAS 

GCSSPNDSEHGPNPTLGSEALLGQRLRLQETSVAP 

FALPTVSLLPAITLGLPAPARADSDRRTHPTLGPR 

GPILGSPHTPLFLPHGLEPEAGGTLPSRLQPILLLD 

PSGSHAPLLTVPGLGPLPFHFAQSLMTTERLSGSG 

LHWPLSRTRSEPLPPSATAPPPPGPMQPRLEQLKT 

HVQVIKRSAKPSEKPRLRQIPSAEDLETDGGGPG

QVVDDGLEHRELGHGQPEARGPAPLQQHPQVLL 

WEQQRLAGRLPRGSTGDTVLLPLAQGGHRPLSR 

AQSSPAAPASLSAPEPASQARVLSSSETPARTLPF 

TTGLIYDSVMLKHQCSCGDNSRHPEHAGRIQSIW 

SRLQERGLRSQCECLRGRKASLEELQSVHSERHV 

LLYGTNPLSRLKLDNGKLAGLLAQRMFVMLPCG 

GVGVDTDTIWNELHSSNAARWAAGSVTDLAFK

VASRELKNGFAVVRPPGHHADHSTAMGFCFFNS 

VAIACRQLQQQSKASKILIVDWDVHHGNGTQQT 

FYQDPSVLYISLHRHDDGNFFPGSGAVDEVGAGS 

GEGFNVNVAWAGGLDPPMGDPEYLAAFRIWM 

PIAREFSPDLVLVSAGFDAAEGHPAPLGGYHVSA 

KCFGYMTQQLMNLAGGAVVLALEGGHDLTAIC 

DASEACVAALLGNRVDPLSEEGWKQKPNLNAIR 



WO 01/57190

PCT/US01/04098




SLEAW1RVHSKYWGCMQRLASCPDSWVPRVPG 
ADKEEVEAVTALASLSVGILAEDRPSEQLVEEEE 
PMNL 


3522 


A 


9 


602 


KMAALGEPVRLERDICRAIELLEKLQRSGEVPPQ 
KLQALQRVLQSEFCNAVREVYEHVYETVDISSSP
EVRANATAKATVAAFAASEGHSHPRVVELPKTE 
EGLGFNIMGGKEQNSPIYISRIIP/GGIADRHGGLK 
RGDQLLSVNGVSVEGEHHEKAVELLKAAQGKV 
KLWRYTPKVLEEMESRFEKMRSAKRRQQT 


3523 


A 


645 


1465 


IMAETSLLEAGASAASTAAALENLQVEASCSVCL 

EYLKEPVIIECGHNFCKACITRWWEDLERDFPCP 

VCRKTSRYRSLRPNRQLGSMVEIAKQL\RPSSGRS 

GMRASAPQHHEALSLFCYEDQEAVCLICAISHTH 

RAHTWPLDDATQEYKEKLQKCLEA\LNQKLQEI 

TRCKSSEEKKPGELKRLVESRRQQILREFEELHRR 

LDEEQQVLLSRLEEEEQDILQRLRENAAHLGDKR 

RDLAHLAAEVEGKCLQSGFEMLKVRPLPLHSPS 

G 


3524 


A 


3 


698 


PMVRHEAGEALGAIGDPEVLEELKQYSSDPV1EV 

AETCQLAVRRLEWLQQHGGEPAAGPYLSVDPAP 

PAEER\DVGRLREALLDESRPLFERYRAMFALRN 

AGGEEAALALAEGLHCGSALFRHEVGYVLGQLQ 

HEAAVPQLAAALARCTENPMVRHECAEALGA1A 

RPACLAALQAHADDPERVVRE\SCKVALDMYEH 

ETGRAFQYADGLEQLRGAPSLGPNPHPELPEDS 


3525 


A 


1452 


694 


EGLQRPEYLVASAAGFQGLAWGGEGRGRAGCS 
SSGFRDAEPLLLSCPGRNEPLKKERLKWKSDYP 
MTDGQLRSKRDEFAVDTAPAFEGRKEIWDALKA
AAYAAEANDHELAQAILDGASITLPHGTLCECY 
DELGNRYQLPIYCLSPPVNLLLEHTEEESLEPPEP 
PPSVRREFPLKVRLSTGKDVRLSASLPDTVGQLK 
RQLHAQE/GTPKPSWQRWFFSGKLLTDRTRLQET
KIQKDFVIQVIINQPPPPQD 


3526 


A 


123 


3441 


PGNEGLGLAADHNEDLGHLSADAPWPAVTMAP 

RKJISHHGLGFLCCFGGSDIPEINLRDNHPLQFME 

FSSPIPNAEELNIRFAELVDELDLTDKNREAMFAL 

PPEKKWQIYCSKKKEQEDPNKLATSWPDYYIDRJ 

NSMAAMQSLYAFDEEETEMRNQVVEDLKTALR 

TQPMRFVTRFIELEGLTCLLNFLRSMDHATCESRI 

HTSLIGCIIALMNNSQGRAHVLAQPEAISTIAQSL 

RTENSKTKVAVLEELGAVCLVPGGHKKVLQAML 

HYQVYAAERTRFQTLLNELDRSLGRYRDEVNLK

TAMSFINAVLNAGAGEDNLEFRLHLRYEFLMLG 

IQPVroKiRQHENAILDKHLDFFEMVRNEDDLEL 

ARRFDMVHIDTKSASQMFELIHI<XLKYTEAYPC 

LLSVLHHCLQMPYKRNGGYFQQWQLLDRILQQI 

VLQDERGVDPDLAPLENFNVKNIVNMLINENEV 

KQWRDQAEKFRKEHMELVSRLERKERECETKTL 

EKEEMMRT\LNKMKDKLARESQELRQARGQVA 

ELVAQLSELSTGPVSSPPPPGGPLTLSSSMTTNDL

PPPPPPLPFACCPPPPPPPLPPGGPPTPPGAPPCLG * 

MGLPLPQDPYPSSDVPLRICKRVPQPSHPLKSFNW 

VKLNEERVPGTVWNEIDDMQVFRILDLEDFEKM 

FSAYQRHQELITNPSQQKELGSTEDIYLASRKVK 

ELSVIDGRRAQNCIILLSKLKLSNEEIRQAILKMD

EQEDLAKDMLEQLLKFPEKSDroLLEEHKHEIER 

MARADRFLYEMSRIDHYQQRLQALFFKKKFQER 

LAEAKPKVEAILLASRELVRSKRLRQMLEVILAI 

GNFMI^GQRGGAYGFRVASLNKIADTKSSIDRN 

ISLLHYL1MILEKHFPDILNMPSELQHLPEAAKVN 

LAELEKEVGNLRRGLRAVEVELEYQRRQVREPS 

DKFVPVMSDFITVSSFSFSELEDQLNEARDICFAK 

ALMHFGEHDSKMQPDEFFGIFDTFLQAFSEARQD 

LEAMRRRKEEEERRARMEAMLICEQRERERWQR 

QRKVLAAGSSLEEGGEFDDLVSALRSGEVFDKD 

LCKLKRSRKRSGSQALEVTRERAINRLNY 


3527 


A 


1445 


714 


LLGTRMLAGQLEARDPKEGTHPEDPCPGAGAV 

MEKTAVAAEVLTEDCNTGEMPPLQQQIIRLHQE 

LGRQKSLWADVHGKLRSHIDALREQNMELREKL 

RALQLQRWKARKKSAASPHAGQESHTLALEPAF 

GKISPLSADEETIPKYAGHKN\QSGHSSWGQRSSS 

NNSAPPKPMSLKIERISSWKTPPQENRDKNLSRR 

RQDRRATPTGRPTPCAERRGWSEDGKVASDTCV 

TLHWPLGKFRFR 


3528 


A 


484 


1777 


RISKIQVYYSTGYSSRKMNPTLGLAIFLAVLLTVK 

GLLKPSFSPRNYKALSEVQGWKQRMAAKELAR 

QNMDLGFKLLKKLAFYNPGRNIFLSPLSISTAFS 

MLCLGAQDSTLDEIKQGFNFRKMPEKDLHEGFH 

YIIHELTQKTQDLKLSIGNTLFIDQRLQPQRKFLE 

DAKNFYSAETILTNFQNLEMAQKQINDFI/ESKTH 

GKrNNLIENIDPGWMLLANYIFFRARWKHEFDP

NVTKEEDFFLEKNSSVKVPMMFRSGIYQVGYDD 

KLSCTILEIPYQKMTAIFILPDEGKLKHLEKGLQV 

DTFSRWKTLLSRRVVDVSVPRLHMTGTFDLKKT 

LSYIGVSKIFEEHGDLTKIAPHRSLKVGEAVNKA

ELKMDERGTEGAAGTGAQTLPMETPLVVKIDKP 

YLLLIYSEKIPSVLFLGKJVNPIGK 


3529 


A 


1


5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDEDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETIIQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYEQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKN1SKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAVV1RPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKI)KX1RMEAHAKPAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMVVSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ

QWFDLICKWSGLEVESASVTSQLEIEAMPPKC 

SDIDPDEETIKJEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

TETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 

KQPGAKPKVKLARKKDDDKKKSSNEKLKQTSV 

FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP

NFNIHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 

TWIAFWAISTTSVNNAYTPQLSLLQNLLARHRI 

SVMGKDFYSHIPVDSNHNFRSSMYEEILISLCLYY 

MRSHYPTHVK\nTAQDLIGNIWMQMMS 

FTELAKVIESSAKGFPSFISDMLSKCKVQKV1LHC 

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLrV\LEHRVM\T 

IPEE\NETGFDFWS\DLEHISPHQPMTSLQYLHAQ 

SITCQGMFLCAVIRA\LHQHCACKMHPQWIGLIT 

STLPYMGKVLQRVWSVTLQLCRNLDNLIQQYK

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

LD3PTTQYHQLLVSVDQKHLFEARSGILSILHMI 

MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 

ATKNLRQQ]LELLGPISMNHGVHFMAAL\FVWN 

ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM 

RAETVIQTVKEVLKQPPAIAKDKKHLSLEVCML 

QFFYAY1QRIPVPNLVDSWASLLILLKDSIQLSLP 

APGQFL1LGVLNEFIMKOTSLENKKDQRDLQDVT 

HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM

VDGTOLESDVEDMLSPAMEfANITPSVYSVHAL 

TLLSEVLAHLLDMVFYSDEKERVIPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKJEAFDLFMDPSFFQMDASCVNHWRAIMDN 

LMTHDKTTFRDLMTRVAVAQSSSLNLJANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVQVFLLMEQELTADEDISRTSGPSVA 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 

QGIHQREFKPYVVRLAKLLRKRAKKNPEEDNSG 

RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF 

NSKWSRCGGHSGSPILYSNAFPNKDMKLENHKP 

CSSKARQKIEEMVEKDFLEGMIKT 


3530


A


1 


5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETIIQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYTIQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAWIRPPLTQGNLRY1AEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI

SQQLTHKDKKIRMEAHAKFAVLWHLTRDLfflNK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ

VQLITSKGNGEI<J>LTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMYVSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDLICKWSGLEVESASVTSQLEIEAMPPKC 

SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLflDSSVAS 

IETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 

KQPGAKPKVKLARKKDDDKKKSSNEKLKQTSV

FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP 
>n^IHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 
TNPIAFVNAISTTSVNNAYTPQLSLLQNLLARHRI 
SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 
MRSHYPTHVKVTAQDLIGNRNMQMMSIEILTLL 
FTELAKVIESSAKGFPSFISDMLSKCKVQKVILHC 
LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 
NFSEDEFDNGS1LQSQLLKVLQRLIV\LEHRVM\T 
IPEEVNETGFDFWSVDLEHISPHQPMTSLQYLHAQ 
SITCQGMFLCAVIRAVLHQHCACKMHPQWIGLIT 
STLPYMGKVLQRVVVSVTLQLCRNLDNLIQQYK
YETGLSDSRPLWMASIIPPDMILTLLEGITAITHYC 
LLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 
MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 
ATKNLRC^ILELLGPISMNHGVHFMAAIAFVWN 
ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM 
RAETVIQTVKEVLKQPPAIAKDKKHLSLEVCML 
QFFYAYIQRIPVP>?LVDSWASLLILLKDSIQLSLP 
APGQFLE-GVLNEFIMKNPSLENKKDQRDLQDVT 
HKWDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 
VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 
TLLSE\nLAHLLDMVFYSDEKERVIPLLVNIMHYY 
VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 
AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 
LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 
LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 
ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 
WPTMITELVQVFLLMEQELTADEDISRTSGPSVA
GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 
LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR
QGIHQREFKPYVVRLAKLLRKRAKKNPEEDNSG 
RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF
NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 
CSSKARQKIEEMVEKDFLEGMIKT 


3531 


A 


553


2470 


LISPSPALSSQDPALSLKENLEDISGWGLPEARSK 

ESVSFKDVAVDFTQEEWGQLDSPQRALYRDVM 

LENYQNLLALGPPLHKPDVISHLERGEEPWSMQ 

REVPRGPCPEWELKAVPSQQQGICKEEPAQEPIM 

ERPLGGAQAWGRQAGALQRSQAAP\GR\RTCHG 

LGRPWEEFPLRCPLFAQQRVPEGGPLLDTRKNV 

QATEGRTKAPARLCAGENASTPSEPEKFPQVRRQ 

RGAGAGEGEFVCGECGKAFRQSSSLTLHRRWHS 

REKAYKCDECGKAFTWSTNLLEHRRIHTGEKPFF 

CGECGKAFSCHSSLNVHQRIHTGERPYKCSACEK 

AFSGSSLLSMHLRVHTGEKPYRCGECGKAFNQR 

THLTRHHR1HTGEKPYQCGSCGKAFTCHSSLTVH 

EKIHSGDKPFKCSDCEKAFNSRSRLTLHQRTHTG 

EKPFKCADCGKGFSCHAYLLVHRRIHSGEKPFKC 

NECGKAFSSHAYLTVHT^RTHTGEKJ^FDCSOCWKA 

FSCHSSLTVHQRIHTGEKPYKCSECGRAFSQNHCL 

IKHQKmSGEKSFKCEKCGEMFNWSSHLTEHQRL 

HSEGKPIJUQFNKHLLSTYYVPGSLLGAGDAGLR 

DVDP1I)ALDVAKLLCVVPPRAGRNFSLGSKPRN 


3532 


A 


3931 


317 


HMLQDSPSAEPPAGSIVITLRHWGMlARGSKPVGD 
GAQPMAAMGGLKVLLHWAGPGGGEPWVTFSES 
SLTAEEVCIHIAHKVGITPPGFNLFALFDAQAQV 

WLPPNHILEIPRDASLMLYF\RHRFYSR\NWHGM 

NPREPAVYRCGPPGTEASSDQTAQGMQLLDPAS 

FEYLFEQGKHEFVNDVASLWELSTEEEIHHFKNE 

SLGMAFLHLCHLALRHGIPLEEVABCKTSFKDCIP 

RSFRRHIRQHSALTRLRLRNVFRRFLRDFQPGRLS 

QQMVMVKYLATLERLAPRFGTERVPVCHLRLLA 

QAEGEPCYIRDSGVAPTDPGPESAAGPPTHEVLV 

TGTGGIQWWPVEEEVNKEEGSSGSSGKNPQASL 

FGKKAKAHKAFGQPADRPREPLGAYFCDFRDIT 

HVGLKEHCVSIHRQbNKCLELSLPSRAAALSFVS 

LVDGYFRLTADSSHYLCHEVAPPRLVMSIRDGIH 

GPLLEPFVQAKLRPEDGLYLIHWSTSHPYRLILTV 

AQRSQAPDGMQSLRLRKFPIEQQDGAFVLEGWG 

RSFPSVRELGAALQGCLLRAGDDCFSLRRCCLPQ 

PGETSNLIIMRGARASPRTLNLSQLSFHRVDQKEI 

TQLSHLGQGTRTKVYEGRLRVEGSGDPEEGKMD 

DEDPLVPGRDRGQELRVVLKVLDPSHHDIALAF 

YETASLMSQVSHTHLAFVHGVCVRGPENIMVTE 

YVEHGPLDVWLRRERGHVPMAWKMVVAQQLA 

SALSYLENKNLVHGNVCGRNILLARLGLAEGTSP 

FIKLSDPGVGLGALSREERVERIPWLAPECLPGG 

ANSLSTAMDKWGFGATLLEICFDGEAPLQSRSPS 

EKEHFYQRQHRLPEPSCPQLATLTSQCLTYEPTQ 

RPSFRTELRDLTRLQPHNLADVLTVNPDSPASDPT 

VFHKRYLKKTRDLGEGHFGKVSLYCYDPTKDGT 

GEMVAVKALKADCGPQHRSGWKQEIDILRTLYH 

EHIIKYKGCCEDQGEKSLQLVMEYVPLGSLRDYL 

PRHSIGLAQLLLFAQQICEGMAYLHAQHYIHRDL 

AARNVLLDNDRLVKIGDFGLAKAVPEGHFYYRV 

REDGDSPVFWYAPECLKEYKFYYASDVWSFGVT 

LYELLTHCDSSQSPPTXFLELIGIAQGQMTVLRLT 

ELLERGERLPRPDKCPCEVYHLMKNCWETEASF 

RPTFENLIPBLKTVHEKYQGQAPSVFSVC 


3533 


A 


182 


3465 


FRWLDFFRGSINSQFEFGRKKENMTSPAKFKKDK 

EIIAEYDTQVKEIRAQLTEQMKCLDQQCELRVQL 

LQDLQDFFRKKAEBEMDYSRNLEKLAERFLAKT 

RSTKDQQFKKDQNVLSPVNCWNLLLNQVKRES 

RDHTTLSDIYLNNIIPRFVQVSEDSGRLFKKSKEV 

GQQLQDDLMKVLNELYSVMKTYHMYNADSISA 

QSKLKEAEKQEEKQIGKSVKQEDRQTPRSPDSTA 

NVRIEEKHVRRSSVKKIEKMKEKRQAKYTENKL 

KADCARNEYLLALEATNASVFKYYIHDLSDLIDQ 

CCDLGYHASLNRALRTFLSAELNLEQSKHEGLD 

AIENAVENLDATSDKQRLMEMYNNVFCPPMKFE 

FQPHMGDMASQLCAQQPVQSELLQRCLQLQSRL 

STLKIENEEVKKTMEATLQTIQDIVTVEDFDVSD 

CFQYSNSMESVKSTVSETFMSKPSIAKRRANQQE 

TEQFYFTKMKEYLEGRNLITKLQAKHDLLQKTL 

GESQRTDCSLARRSSTVRKQDSSQAEPLWESCIR 

FISRHGLQHEGIFRVSGSQVEVNDIKNAFERGEDP 

LAGDQNDHDMDSIAGVLKLYFRGLEHPLFPKDIF 

HDLMACVTMDNLQERALHIRKVLLVLPKTTLII 

MRYLFAFLNHLSQFSEENMN4DPYNLAICFGPSL 

MSVPEGHDQVSCQAHVNELDCTI1IQHENIFPSPRE 
LEGPVYSRGGSMEDYCDSPHGETTSVEDSTQDV 

TAEHHTSDDECEPDEAIAKFDYVGRTARELSFKK 

GASLLLYQRASDDWWEGRHNGEDGLIPHQYIVV 

QDTEDGVVERSSPKSEffiVISEPPEEKVTARAGAS 

CPSGGHVADIYLANINKQRKRPESGSIRKTFRSDS 

HGLSSSLTDSSSPGVGASCRPSSQPIMSQSLPKEG 

PDKCSISGHGSLNSISRHSSLKNRLDSPQIRKTAT 

AGRSKSFDNHRPMDPEVIAQDIEATMNSALNELR 

ELERQSSVKHTPDVVLDTLEPLKTSPVVAPTSEPS 

SPLHTQLLKDPEPAFQRSASTAGDIACAFRPVKS 

VKMAAPVKPPAT\RPKPT\VFPKTNATSPGVNSST 

SPQSTDKSGTV 


3534 


A 


1 


2640 


FRRFVCPASRRPAAGLRDAASSAPRGMASEGPRE 

PESEGIKLSADVKPFVPRFAGLNVAWLESSEACV 

FPSSAATYYPFVQEPPVTEQKIYTEDMAFGASTFP 

PQYLSSEITLHPYAYSPYTLDSTQNVYSVPGSQY 

LYNQPSCYRGFQTVKHR3SIENTCPLPQEMKALFK 

BCKTYDEKKTYDQQKFDSERADGTISSEIKSARGS 

HHLSIYAENSLKSDGYHKRTDRKSRIIAKNVSTS 

KPEFEFTTLDFPELQGAENNMSEIQKQPKWGPVH 

SVSTDISLLREVVKPAAVLSKGEIWKNNPNESV 

TANAATNSPSCTRELSWTPMGYWRQTLSTELS 

AAPKNVTSMINLKTIASSADPKNVSIPSSEALSSD 

PSYNKEKHIIHPTQKSKASQGSDLEQNEASRKNK 

KKKEKSTSKYEVLTVQEPPRIEDAEEFPNLAVAS 

ERRDRIETPKFQSKQQPQDNFKNNVKKSQLPVQL 

DLGGMLTALEKKQHSQHAKQSSKPVWSVGAV 

PVLSKECASGERGRRMSQMKTPHNPLDSSAPLM 

KKGKQREIPKAKKPTSLKKIILKERQERKQRLQE 

NAVSPAFTSDDTQDGESGGDDQFPEQAELSGPEG 

MDELISTPSVEDKSEEPPGTELQRDTEASHLAPN 

HTTFPKIHSRRFRDYCSQMLSKEVDACVTDLLKE 

LVRFQDRMYQKDPVKAKTKRRLVLGLREVLKH 

LKLKKLKCVHSPNCEKIQSKGGLDDTLHTIIDYA 

CEQNIPFWALNRKALGRSLNKAVPVSVVGIFSY 

DGAQDQFHKMVELTVAARQAYKTMLENVQQE 

LVGEP\SLRHLPAYPHRAPAALQKMAPQPA^KEK 

EEP1WIEIWKKHLEAYSGCTLELEESLEASTSQM 

MNLNL 


3535 


A 


1747 


983 


LFQFQVCRSVLSPRAAGCTWSLAPRSRGAAGSPR 

RYRGPQPQPAPPSALPNSRPSPVASGREMVVLSV 

PAEVTVILLPIEGTTTP1AFVKDILFPYIEENVKEY 

LQTHX^EEECQQDVSLLRKQV^ 

REAGMKVYIYSSGSVEAQKLLFGHSTEGDILELV 

DGHFDTKIGl^VESESYRKIADSIGCSTNNILFLT 

DVTREASAAEEADVHVAVWRPGNAGLTDDEK 

TYYSLITSFSELYLPSST 


3536 


A 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTS 
IESRGRPAASAGLRRDRCALRRWPLRRAPLARAT 
RREAGSPRRCAPRPRACPQGWSBARHQPGGLCL 

YKTELSB^ECCSTGRLSTSWTEEDVNDNTLFKW 
MIFNGGAPNCEPCKETCENVDCGPGKKCRMNKK 
NKPRCVCAPDCSNITWKGPVCGLDGKTYRNECA 
LLKARCKEQPELEVQYQGRCKKTCRDVFCPGSS 
TCVXVDQTNNAYCVTCNRICPEPASSEQYLCGND 
GVTYS\SACHLRKATCLLGRSIGLAYEGKCIKAK 
SCEDIQCTGGKKCLWDFKVGRGRCSLCDELCPD 
SKSDEPVCASDNATYASECAMKEAACSSGVLLE 
VKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


3537 


A 


285 


2123 


IGLFLQVAPLSVMAKSCPSVCRCDAGFIYCNDRF 
LTSIPTGIPEDATTLYLQ^QINNAGIPSDLKNLL 
KVERIYLYHNSLDEFPT^PKYVKELHLQENNIR 
TITYDSLSKIPYLEELHLDDNSVSAVSIEEGAFRD 
SNYLRLLFLSRNHLSTIPWGLPRTIEELRLDDNRIS 
TISSPSLQGLTSLKRLVLDGNLLNNHGLGDKVFF 
NLVNLTELSLVRNSLTAAPVNLPGTNLRKLYLQ 
DNHINRVPPNAFSYLRQLYRLDMSNNNLSNLPQ 
GIFDDLDNITQLILRNNPWYCGCKMKWVRDWL 
QSLPVKVNVRGLMCQAPEKVRGMAIKDLNAELF 
DCKDSGIVSTIQITTAIPNTVYPAQGQWPAPVTK 
QPDIKNPKLTKDHQTTGSPSRKTITITVKSVTSDTI 
HISWKLALPMTALRLSWLKLGHSPAFGSITETIVT 
GERSEYLVTALEPDSPYKVCMVPMETSNLYLFD 
ETPVCIETETAPLRMYNPTTTLNREQEKEPYKNP 
NLPLAAIIGGAVALVTLALLALVCWYVHRNGSLF 
SRNCAYSKGRRRKDDYAEAGTKKDNSILEIRETS 
: FQMLPISNEPISKEEFVIrn , lFPPNGMNLYK>JNH 


3538 


A 


877 


6184 


WNVKPSLLVVQLFKFSDKEEHEQNDSISGKTGET 

GVEEMIATRKVEQDSKETVKLSHEDDHTLEDAGS 

SDISSDAACTNPNKTENSLVGLPSCVDEVTECNL 

ELKDTMGIADKTENTLERNKIEPLGYCEDAESNR 

QLESTEFNKSNLEVVDTSTFGPESNILENAICDVP 

DQNSKQLNAIESTKDESHETANLQDDRNSQSSSV 

SYLESKSVKSKHTKPVIHSKQNMTTDAPKKIVAA 

KYEVIHSKTKVNVKSVKRNIDVPESQQNFHRPV 

KVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKK 

TLQDQTLVQIFKPLTHSLSDKSHAHPGCLKEPHH 

PAQTGHVSHSSQKQCHKPQQQAPAMKTNSHVK 

EELEHPGVEHFKEEDKLKLKKPEKNLQPRQRRSS 

KSFSLDEPPLFIPDNIATIRREGSDHSSSFESKYMW 

TPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDC 

VGLSLSQAQQMGEEDKEYVCVKCCAEEDKKTEI 

LDPDTLENQATVEFHSGDKTMECEKLGLSKHTT 

NDRTKYIDDTVKHKVKDLKRESGEGRNSSDCRD 

NEIKKWQLAPLRKMGQPVLPRRSSEEKSEKIPKE 

STTVTCTGEKASKPGTHEKQEMKKKKVXEKGVL 

NVHPAASASICPSADQIRQSVRHSLKDILMKRLTD 

SNLKVPEEKAAKVATKIEKELFSFFRDTDAKYKN 

KYRSLMFNLKDPKKNILFKKVLKGEVTPDHLIR 

MSPEELASKELAAWRRRENRHTEEMIEKEQREVE 

RRPITKITHKGEIEIESDAPMKEQEAAMEIQEPAA 

NKSLEKPEGSEK\RKEEVDSMSKDTTSQHRQHLF 

DLNCIGCIGRMAPPVDDLSPKKVKWVGVARKH 

SDNEAESIADALSSTSNELASEFFEEEKQESPKSTF 

or AriSJ'nJVLru 1 v r. V Jbo 1 rLrAKJLJNr 1 WiVUr AIM Mr o 

VAKFVTKAYPVSGSPEYLTEDLPDSIQVGGRISPQ 

TVWDYVEKJKASG'TKJBICVVRFTPVTEEDQISYT 

LLFAYFSSRKRYGVAA2WMKQVKDMYLIPLGAT 

DKIPHPLVPFDGPGLELHRPNLLLGLIIRQKLKRQ 






WO 01/57190 



PCT/US01/04098 



HSACASTSH1AETPESAPPIALPPDKKSKIEVSTEE 
APEEENDFFNSFTTVLHKQRNKPQQNLQEDLPTA 
VEPLMEVTKQEPPKPLRFLPGVLIGWENQPTTLE 
LANKPLPVDDILQSLLGTTGQVYDQ\AQSVMEQ 
NTVKEEPFLNEQTNSKIEKTDNVEVTDGENKEIK 
VKVDNISESTDKSAEBBTSWGSSSISAGSLTSLSL 
RGKPPDVSTEAFLTM.SIQSKQEETVESKEKTLK& 
QLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGN 
VSCSENLVANTARSPQFINLKRDPRQAAGRSQPV 
TTSESKDGDSCRNGEKHMLPGLSHNKEHLTEQIN 
VEEKLCSAEKNSCVQQSDNLKVAQNSPSVENIQT 
SQAEQAKPLQEDILMQNEETVHPFRRGSAVATSH 
FEVGNTCPSEFPSKS1TFTSRSTSPRTSTNFSPMRP 
QQPNLQHLKSSPPGFPFPGPPNFPPQSMFGFPPHL 
PPPLLPPPGFG\FA\QNPMVPWPPW\HLP\GQPQR 
MMGPLSQASRYIGPQNFYQVKDIRRPERRHSDP 
WGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKR 
ERHEKEWEQESERHRRRDRSQDKDRDRKSREEG 
HKDKERARLSHGDRGTDGKASRDSRNVDKKPD 
KPKSEDYEKDKEREKSKHREGEKDRDRYHKDR 
DHTDRTKSKR 


3539 


A 


157 


1769 


GSWTVELSLKPSASPSLKWVCLPGAAAVNKHRS 
GAGGLIRSLIQCTWAPAGPARRGGRGIEDFPYLF 
FQLTHCQQRICSVTQAGVQWCDHSSLQPQTPGL 
NQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPP 
NVTWTELEDRDGRVYPHPQDLLAALPLALVLLA 
MRLAFERFIGLPLSRWLGVRDQTRRQVKPNATL 
EKHFLTEGHRPKEPQLSLLAAQCGLTLQQTQRW 
FRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGL 
SVLYHESWLWAPVMCWDRYPNQLTLSCPAADS 
EA\SLYWWYLLELGFYLSLLIRLPFDVKRKGGGP 
SSIKPRPHYDPPSTA\DFKEQVIHHFVAVILMTFSY 
SANLLRIGSLVLLLHDSSDYLLEACKMVNYMQY 
QQVCDALFLIFSFVFFYTRLVLFPTQILYTTYYESr , 
SNRGPFFGYYFFNGLLMLLQLLHVFWSCLILRML 
YSFMKKGQMEKDIRSDVEESDSSEEAAAAQEPL 
QLKNGTAGGPRPAPTDGPRSRVAGRLTNRHTTA 
T 


3540 


A 


267 


1397 


SPAGYCHSGLLPGCSRSA/CADLAKHQELPGKKL 

LSEKKLKRYFVDYRRVLVCGGNGGAGASCFHSE 

PRKEFGGPDGGDGGNGGHVILRVDQQVKSLSSV 

LSRYQGFSGEDGGSKNCFGRSGAVLYIRVPVGTL 

VKEGGRWADLSCVGDEYIAALGGAGGKGNRF 

FLANNNRAPVTCTPGQPGQQRVLHLELKTVAHA 

GMVGFPNAGKSSLLRAISNARPAVASYPFTTLKP 

HVGIVHYEGHLQIAVADIPGIIRGAHQNRGLGSA 

FLRHIERCRFLLFVVDLSQPEPWTQVDDLKYELE 

MYEKGLSARPHAIVANKJDLPEAQANLSQLRDH 

LGQEVIVLSALTGENLEQLLLHLKVLYDAYAEA 

ELGQGRQPLRW 




A 


i 
i 


5UUS 


DTQ V SETLKilFAGKVTTAS VKERREE-SELGKCV 

AGKDLPEGAVKGLCKLFCLTLHRYRDAASRRAL 

QAAIQQLAEAQPEATAKNLLHSLQSSGIGSKAGV 

PSKSSGSAALLALTWTGLLVRJVFPSRAKRQGDI 

WNKLVEVQCLLLLEVLGGSHKHAVDGAVKICLT 
KLWKENPGLVEQYLSAJLSLEPNQNYAGMLGLL 

VQFCTSHKEMDVVSQHKSALLDFYMKNILMSK 

VKPPKYLLDSCAPLLRYLSHSEFKDLILPTIQKSL 

LRSPENVTETISSLLASVTLDLSQYAMDIVKGLAG 

HLKSNSPRLMDEAVLALRNLARQCSDSSAMESL 

tkhlfailggsegkltvvAqkmsvlsgigsvshh 

WSGPSSQVLNGIVAELFIPFLQQEVHEGTLVHA 

VS\O.ALWCNRFTMEVPKKLTEWFKKAFSLKTST 

SAVRHAYLQCMLASYRGDTLLQALDLLPLLIQT 

VEKAASQSTQVPTITEGVAAALLLLKLSVADSQA 

EAIASSFWQLIVDEKKQVFTSEKFLVMASEDAL 

CTVLHVLTERLFLDHPHRLTGNKVQQYHRALVA 

V1XSRTWHVRRQAQQTVRKLLSSLGGFKLAHGL 

LEELKTVLSSHKVLPLEALVTDAGEVTEAGKAY 

VPPRVLQEALCVISGVPGLKGDVTDTEQLAQEM 

LIISHHPSLVAVQSGLWPALLARMKIDPEAFITRH 

LDQIIPRMTTQSPLNQSSMNAMGSLSVLSPDRVL 

PQLISTITASVQNPALRLVTREEFA1MQTPAGELY 

DKSNQSAQQDSIKKANMKRENKAYSFKEQIIELE 

LKEEIKKKKGIKEEVQLTSKQKEMLQAQLDREA 

QVRRRLQELDGELEAALGLLDIILAKNPSGLTQYI 

PVLVDSFLPLLKSPLAAPR1KNPFLSLAACVMPSR 

LKALGTLVSHVTLRLLKPECVLDKSWCQEELSV 

AVKRAVMLLHTHTITSRVGKGEPGAAPLSAPAFS 

LVFPFLKMVLTEMPHHSEEEEEWMAQILQILTVQ 

AQLRASPNTPPGRVDENGPELLPRVAMLRLLTW 

VIGTGSPRLQVLASDTLTTLCASSSGDDGCAFAE 

QEEVDVLLCALQSPCASVRETVLRGLMELHMVL 

PAPDTDEKNGLNLLRRLWVVKFDKEEEIRKLAE 

RLWSMMGLDLQPDLCSLLIDDVIYHEAAVRQAG 

AEALSQAVARYQRQAAEVMGRLMEIYQEKLYR 

PPPVLDALGRVISESPPDQWEARCGLALALNKLS 

QYLDSSQVKPLFQFFVPDALNDRHPDVRKCMLD 

AALATLNTHGKENVNSLLPVFEEFLKNAPNDAS 

YDAVRQSVVVLMGSLAKHLDKSDPKVKPIVAKL 

IAALSTPSQQVQESVASCLPPLVPAIKEDAGGMIQ 

RLMQQLLESDKYAERKGAAYGLAGLVKGLGILS 

LKQQEMMAALTDAJQDKKNFRRREGALFAFEM 

LCimGKLFEPYVVHVLPHLLLCFGDGNQYVRE 

AADDCAKAVMSNLSAHGVKLVLPSLLAALEEES 

WRTKAGSVELLGAMAYCAPKQLSSCLPNIVPKL 

TEVLTDSHVKVQKAGQQALRQIGSVIRNPEILAI 

APVLLDALTOPSRKTQKCLQTLLDTKFVHFIDAP 

SLALIMPIVQRAFQDRSTDTRKMAAQIIGNMYSL 

TDQKDLAPYLPSVTPGLKASLLDPVPEVRTVSAK 

ALGAMVKGMGESCFEDLLPWLMETLTYEQSSV 

DRSGAAQGLAEVMAGLGVEKLEKLMPEIVATAS 

KVDIAPHVRDGYIMMFNYLPITFGDKFTPYVGPII 

PCELKALADENEFVRDTALRAGQRVISMYAETAI 

ALLLPQLEQGLFDDLWRIRFSSVQLLGDLLFfflSG 

V ruKJVL 1Tb 1 ASEDDNFGTAQSNKAIITALGVERR 

NRVLAGLYMGRSDTQLVVRQASLHVWKIVVSN 

TPRTLREELPTLFGLLLGFLASTCADKRTIAARTL 

GDLVRBXGEKTLPEIIPILEEGLRSQKSDERQGVCI 

GLSEMKSTSRDAVLYFSESLVPTARKALCDPLE 
EVREAAAKTFEQLHSTIGHQALEDILPFLLKQLD 

DEEVSEFALDGLKQVMADCSRVVLPYLVPKLTTP 

PVNTRVLAFLSSVAGDALTRHLGVILPAVMLAL 

KEKLGTPDEQLEMANCQAVILSVEDDTGHRIIIE 

DLLEATRSPEVGMRQAAAIELNIYCSRSKADYTS 

HLRSLVSGLHILFNDSSPVVLEESWDALNAITKK 

LDAGNQLALIEELHKEIRLIGNESKGEHVPGFCLP 

KKGVTSILPVLREGVLTGSPEQKEEAAKALGLVI 

RLTSADALRPSWSITGPLIRILGDRFSWNVKAAL 

LETLSLLLAKVGIALKPFLPQLQTTFTKALQDSNR 

GVRLKAADALGKLISIHIKVDPLFTELLNGIRAME 

DPGVRDTMLQALRFVIQGAGAKVDAVIRKNIVS 

LLLSMLGHDEDNTRISSAGCLGELCAFLTEEELS 

AVLQQCLLADVSGIDWMVRHGRSLALSVAVNV 

APGRLCAGRYSSDVQEMELSSATADRIPIAVSGV 

RGMGFLMRHHIETGGGQLPAKLSSLFVKCLQNP 

SSDIRLVAEKMIWWANKDPLPPLDPQAIKPILKA 

LLDNTKDKNTVVRAYSDQAIVNLLKMRQGEEVF 

QSLSKILDVASLEVLNEVNRRSLBCKLASQADSTE 

QVDDULT 


3542 



A 


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAP 

GMP\GLMGSNGSPGQPGTPGSKGSKGEPGIQGMP 

GASGLKGEPGATGSPGEPGYMGLPGIQGKKGDK 

GNQGEKGIQGQKGENGRQGIPGQQGIQGHHGAK 

GERGEKGEPGVRGAIGSKGESGVDGLMGPAGPK 

GQPGDPGPQGPPGLDGKPGREFSEQFIRQVCTDV 

IRAQLPVLLQSGRIRNCDHCLSQHGSPGIPGPPGPI 

GPEGPRGLPGLPGRDGVPGLVGVPGRPGVRGLK 

GLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGI 

SKEGPPGDPGLPGKDGDHGKPGIQGQPGPPGICD 

PSLCFSVIARRDPFRKGPNY 


3543 


A 


654 


194 


PARSLEKMKASVVLSLLGYLVVPSGAYILGRCTV 

AKKLHDGGLDYFERYSLENWVCLAYFESKFNPS\ 

AIYENTREGYTGFGLFQMRGSDWCGDHGRNRC 

HMSCSALLNPNLEKTIKCAKTIVKGKEGMGAWP 

TWSRYCQYSDTLARWLDGCKL 


3544 


A 


2 


1074 


SCRLAAGRLAQWLLRASRSGMLRAGWLRGAAA 
LALLLAARWAAFEPITVGLAIGAASAITGYLSY 
NDIYCRFAECCREERPLNASALKLDLEEKLFGQH 
LATEWFKALTGFRNNKJ4PKKPLTLSLHGWAGT 
GKNFVSQMGAENLHPKGLKSNFVHLFVSTLHFP 
HEQKIKLYQDQLQKWmGWSACANSVFIFDEM 
DKL\HPGnE\AnCPFLDYYEHVERVSYR\KAIFIFLS 
NAGGDLITKTALDFWRAGRKREDIQLKDLEPVL 
SVGVFNMCHSGLWHSGLIDKNLIDYFIPFLPLEYR 
H\nCMCVRAEMRARGSAIDEDIVTRVAEEMTFFP\ 
RDEKTYSDKGCKTVQSRLDFH 


3545 


A 


3 


273 


SAQGRSWGRFYRQ1KRHPGIIPM1GLICLGMGSA 

ALYLLRLALRSPDVW*SWDRXNNPEPWNRLSPN 

DQYKFLAVSTDYKKLKKDRPDF 


3546 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVVVLLWEAGAVPA 
PKVPIKMQVKHWPSEQDPEKAWGARVVEPPEK 
DDQLWLFPVQKPKLLTTEEKPRGQGRGPILPGT 
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 
EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR 
GHHCPRPVPRPRLLGLGPSLPCPS 


3547 


A 


23 


591 


ALSTETRTPDMKJRLLLVTSLVYVLLWEAGAVPA 
PKVPIKMQVKHWPSEQDPEKLAWGARVVEPPEK 
DDQLVVLFPVQKPKLLTTEEKPRGQGRGPILPGT 
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 
EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR 
GHHCPRPVPRPRLLGLGPSLPCPS 


3548 


A 


3 


1641 


TWLPSVPAEEVQQPEMAAVLNAERLEVSVDGLT 

LSPDPEERPGAEGAPLAAATAATALATWIRSRPG 

RLRGTARSPGRRAAGGAAEEARRLEQRWGFGLE 

ELYGLALRFFKEKDGKAFHPTYEEKLKLVALHK 

QVLMGPYNPDTCPEVGFFDVLGNDRRREWAAL 

GNMSKEDAMVEFVKLLNRCCHLFSTYVASHKIE 

KEEQEKKRKEEEERRRREEEERERLQKEEEKRRR 

EEEERLRREEEERRRffiEERLRLEQQKQQIMAAL 

NSQTAVQFQQYAAQQYPGNYEQQQILIRQLQEQ 

HYQQYMQQLYQVQLAQQQAALQKQQEVWAG 

SSLPTSSKVECNCTQVI*CQFNRQAKTHTDSSEKE 

LEPEAAEEALENGPKESLPV1AAPSMWTRPQIKD 

FKEK1QQDADSVITVGRGEWTVRVPTHEEGSYL 

FWEFATDNYDIGFGVYFEWTDSI*NTAVSVHVSE 

SSDDDEEEEEOTGCEEKAKKNANKPLLDEIVPVY 

RRDCHEEVYAGSHQYPGRGVYLLKFDNSYSLW 

RSKSVYYRVYYTR 


3549 


A 


1837 


3593 


PAVLVLEPASQSRKQQNTASATAQHWSAQIHKE 

SFLAWFTKDEQKHRRPYEFEVERDAKARGLEQF 

SATHGHTPIILNGWHGESAMDLSCSSEGSPGATS 

PFPVSASTPKIGAISSLQGALGMDLSGILQAGLMP 

VTGQIVNGSLRRDDAATRRRRGRRKHVEGGMD 

LIFLKEQTLQAGILEVHEDPGQATLSTTHPEGPGP 

ATSAPEPATAASSQAEKSIPSKSLLDWLRQQADY 

SLEVPGFGANFSDKPKQRRPRCKEPGKLDVSSLS 

GEERVPAIPKEPGLRGFLPENKFNHTLAEPILRDT 

GPRRRGRRPRSELLKAPSIVADSPSGMGPLFMNG 

LUGMDLVGLQNMRNMPGIPLTGLVGFPAGFAT 

MPTGEEVKSTLSMLPMMLPGMAAVPQMFGVGG 

LLSPPMATTCTSTAPASLSSTTKSGTAVTEKTAE 

DKPSSHDVKTDTLAEDKPGPGPFSDQSEPAITTSS 

PVAFNPFLPGVSPGLrYPSMFLSPGMGMALPAM 

QQARHSEIVGLESQKRKKKKTKGDNPNSHPEPA 

PSCEREPSGDENCAEPSAPLPAEREHGAQAGEGA 

LKDSNNDTN 


3550 


A 


287 


39 


QLNLNKIATSQKHRDFVAESVGEICPVGSLAGIGE 
VMDKKLEEGCFDKAYVVLGQFLVLKKDEDLF*E 
WLRDTGGARTRGSRE 


3551 


A 


21 


3925 


GDLLEVGLPPGLEFPRGICLRGLRRTMSLDFGSV 

ALPVQNEDEEYDEEDYEREKELQQLLTDLPHDM 

LDDDLSSPELQYSDCSEDGTDGQPHHPEQLEMS 

WNEQMLPKSQSVNGPSCQGLEPYNKVTYKPYQS 

SAQNNGSPAQEITGSDTFEGLQQQFLGANENSAE 

NMQ11QLQVLNKAKJERQLENLIEKLK 

LNHQLVHKDEKDGLTLSLRESQKLFQNGKEREIQ 

LEAQIKALETQIQALKVNEEQMIKKSRTTEMALE 

SLKQQLVDLHHSESLQRAREQHESIVMGLTKKY 

EEQVLSLQKNLDATVTALKEQEDICSRLKDHVK 
QLERNQEAIKLEKTEIINKLTTISLEESQKQCAHLL 

QSGSVQEVAQLQFQLQQAQKAHAMSANMNKA 

LQEELTELKDEISLYESAAKLGIHPSDSEGELNIEL 

TESYVDLGIKKVNWKKSKVTSIVQEEDPNEELSK 

DEFILKLKAEVQRLLGSNSMKRHLVSQLQNDLK 

DCHKKIEDLHQVKKDEKSIEVETKTDTSEKPKNQ 

LWPESSTSDVVRDDILLLKNEIQVLQQQNQELKE 

TEGKLRNTNQDLCNQMRQMVQDFDHDKQEAV 

DRCERTYQQHHEAMKTQIRESLLAKHALEKQQL 

FEAYERTHLQLRSELDKLNKEVTAVQECYLEVC 

REKDNLELTLRKTTEKEQQTQEKJKEKLIQQLEK 

EWQSKLDQTIKAMKKKTLDCGSQTDQVTTSDVI 

SKIOBMAIMIEEQKCTIQQNLEQEKDIAIKGAMKK 

LEIELELKHCENITKQVEIAVQNAHQRWLGELPE 

LAEYQALXHCAEQKKWEEQHEVSVNKRISFAVSE 

AKEKWKSELENMRKNILPGKELEEKIHSLQKELE 

LKNEEVPVVIRAELAICARSEW>OCEKQEEIHRIQE 

QNEQDYRQFLDDHRNKJNEVLAAAKEDFMKQK 

TELLLQKEIELQTCLDQSRREWTMQEAKRIQLEI 

YQYEEDILTVLGVLLSDTQKEfflSDSEDKQLLEr 

MSTCSSKWMSVQYFEKLKGCIQKAFQDTLPLLV 

ENADPEWKKRNMAELSICDSASQGTGQGDPGPA 

AGHHAQPLALQATEAEADKKKVLEIKDLCCGHC 

FQELEKAKQECQDLKGKLEKCCRHLQHLERKHK 

AVVEKIGEENNKWEELIEENNDMKNKLEELQT 

LCKTPPRSLSAGAIENACLPCSGGALEELRGQYIK 

A\^OKCDMLRYIQESKERAAEMVKAEVL*ERQ 

ETARKMRKYYLICLQQILQDDGKEGAEKKIMNA 

ASKLATMAKLLETPISSKSQSKTTQSGMSK 


3552 


A 


771 


375 


ARTRQTSGQAREPEKESPAPGGGGLAEIRSRQQL 
SQTSRIPPLAKDQAVEAMFPPARGKELLSFEDVA 
MYFTREEWGHLNWGQKDLYRDVMLENYRNMV 
LLVYFQFDAAIPLC*TSLAHSSWLQLYFRLYF 


3553 


A 


76 


72 


PGVRGVEAPGGVAPGRNAMRRGERRDAGGPRP 

ESPVPAGRASLEEPPDGPSAGQATGPGEGRRSTE 

SEVYDDGTNTFFWRAHTLTVLFILTCTLGYVTLL 

EETPQDTAYNTKRGIVASILVFLCFGVTQAKDGP 

FSRPHPAYWRFWLCVSWYELFLIFILFQTVQDG 

RQFLKYVDPKLGVPLPERDYGGNCLIYDPDNET 

DPFHNTWDKLDGFVPAHFLGWYLKTLMERDWW 

MCMnSVMFEFLEYSLEHQLPNFSECWWDHWIM 

DVLVCNGLGIYCGMKTLEWLSLKTYKWQGLWN 

PTYKGKMKR1AFQFTPYSWVRFEWKPASSLRR 

WLAVCGIILVFLLAELNTFYLKFVLWMPPEHYLV 

LLRLVFFVNVGGVAMREIYDFMDDPKPHKKLGP 

QAWLVAAITATELLIVVKYDPHTLTLSLPFYISQC 

WTLGSVLALTWTYWRFFLRDITLRYKETRWQK 

WQNKDDQGSTVGNGDQHPLGLDEDLLGPGVAE 

GEGAPTPN*PRGPAPRPLPSAPRAVCGASSRR 


3554 


A 


2 


2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPVVNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

VPSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 
HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLEDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQIDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVBLSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3555 


A 


2 


2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPVVNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

WSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 

HRHLNPDTELKRYFGARAELGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3556 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL 

VKREYLRVNVVKTCEEILNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVTRVYSQQCQYLVEDIQHIL 

ERLHRAQLQDIIDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEE1PPEVPTEPREPERIPVTVLPPEAITILEAEPIR 

MLEIEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRJIRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

LPVVPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 
LSAQQILHVKQEKPYGRLLIQPGPRFH 


3557 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL 
VKREYLRVNWKTCEEILNYVLVRVQPPQPGLP 
RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQIRIDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERJPVTVLPPEAITILEAEPIR 

MLEEEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRKELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLPPEERWAWPEVEAPEAPA 

LPVVPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH 


3558 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEIEDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTE1PPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGICPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 

FLKVSSVFKDEATVRMAVQDAVDALMQKAFNS 

SSFNSNTFLTRLLVHMGLLKSEDKVKAIANLYGP 

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE 

SCSFARHSLLQTLYKV 


3559 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEffiDFDSLEALRLEGNTVGVEA 

ARVTAKAL*KKSELKRCHWSDMFTGRLRTEIPPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFIEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 

FLKVSSWKDEATVRMAVQDAVDALMQKAFNS 

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE 
SCSFARHSLLQTLYKV 


3560 


A 


2 


1198 


FVRELPRPRPGAATAAIMVSVINTVDTSHEDMIH 
DAQMDYYGTRLATCSSDRSVKIFDVRNGGQILIA 
DLRGHEGPWQVAWAHPMYGNILASCSYDRKV 

nWREENGTWEKSHEHAGHDSSVNSVCWAPHDY 

GLILACGSSDGAISLLTYTGEGQWEVKKINNAHT 

IGCNAVSWAPAVVPGSLIDHPSGQKPNYIKRFAS 

GGCDNLIKLWKEEEDGQWKEEQKLEAHSDWVR 

DVAWAPSIGLPTSHASCSQDGRVFIWTCDDASS 

NTWSPKLLHKFNDVVWHVSWSITANILAVSGGD 

NKVTLWKESVDGQWVCISDVNKGQGSVSASVT 

EGQQNEQ*QDRWGLAPHPPAPGLPLPGPTNQTT 

GKSPQLQQDYFPRRSYRCSHRLIICLNVIGDAL 


3561 


A 


540 


86 


WRVKEMTSTLPKALGRXTASRSHTTLQGGSCCP 
VLWTAKLRCRKJLRFPLPPPPPSSSAWPWQGWGI 
RGEQEAEGPLGETGPPVGPELSGLRQWRKLIKGR 
YGEWRGSGQKTGQPS*TTMQGGETEENRTETTT 
GNKQRESEAPWVRHTYIT 


3562 


A 


1920 


242 


PMMAMPFFERFKSSIQRPSPVLVLSQNTKRESGR 

KVQSGNINAAKTTADIIRTCLGPKSMMKMLLDP 

MGGIVMTNDGNAILREIQVQHPAAKSMIEISRTQ 

DEEVGDGTTSVIILAGEMLSVAEHFLEQQMHPTV 

V1SAYRKALDDMIST1.KKISIPVDISDSDMMLNIIN 

SS1TTKAISRWSSLACNIALDAVKMVQFEENGRK 

EIDIKKYARVElQPGGUEDSCVLRGVMINKDVtH 

PRMRRYIKNPRIVLLDSSLEYKKGESQTDIEITRE 

EDFTRILQMEEEYIQQLCEDIIQLKPDVVITEKGIS 

DLAQHYLMRANTTAIRRVRKTDNNR1ARACGARI 

VSRPEELREDDVGTGAGLLEIKKIGDEYFTFITDC 

KDPKACTELLRGASKEILSEVERNFQDAMQVCRN 

VLLDPQLVPGGGASEMAVAHALTEKSKAMTGV 

EQWPYRAVAQALEVIPRILIQNCGASTDRLLTSLR 

AKHTQENCETWGVNGETGTLVDMKELGIWEPL 

AVKLQTYKTAVETAYLLLRIDDIVSGHKKKGDD 

QSRQGGAPDAGQE 


3563 


A 


1571 


560 


GPSLLGTRGTPNPARTLQIFFLIIGRRLTGRMAAV 

DDLQFEEFGNAATSLTANPDATTVNIEDPGETPK 

HQPGSPRGSGREEDDELLGNDDSDKTELLAGQK 

KSSPFWTFEYYQTFFDVDTYQVFDRIKGSLLPIPG 

KNFVRLYIRSNPDLYGPFWICATLVFAIAISGNLS 

NFLIHLGEKTYHYVPEFRKVSIAATIIYAYAWLVP 

LALWGFLMWRNSKVMNIVSYSFLEIVCVYGYSL 

FIYIPTA1LWIIPHKAWWILVMIALGISGSLLAMT 

FWPAVREDNRRVALATIVTIVLLHMLLSVGCLA 

YFFDAPEMDHLPTTTATPNQTVAAAKSS 


3564 


A 


1 


328 


NSRVDDFVAHLQRPLLGPASCLGILRPAMTAHSF 
ALPGIIFTTFWGLVGIAGPWFVPKGPNRGyilTML 
VATAVCCYLFWLIAILAQLNPLFGPQLKNETIWY 
VRFLWE 


3565 


A 


2 


1081 


FVTDFPARSMAATSLMSALAARLLQPAHSCSLRL 

RPFHLAAVRNEAVVISGRKLAQQIKQEVRQEVEE 

WVASGNKRPHLSVILVGENPASHSYVLNKTRAA 

AVVGINSETIMKPASISEEELLNL1MCLNNDDNVD 

GLLVQLPLPEHIDERRICNAVSPDKDVDGFHV1N 

VGRMCLDQYSMLPATPWGVWEIIKRTGEPTLGK 

NVWAGRSKNVGMPIAMLLHTDGAHERPGGDA 

TVTISHRYTPKEQLKKHT1LADIVISAAGIPNLITA 

DMIKEGAAVIDVGINRVHDPVTAKPKLVGDVDF 
EGVRQKAGYITPVPGGVGPMTVAMLMKNTI1AA 
KKVLRLEEREVLKSKELGVATN 


3566 


A 


3 


1130 


SCRRGRQQQRRNVSLSSQFAHTMAAPAQQTTQP 

GGGKRKGKAQYVLAKRARRCDAGGPRQLEPGL 

QGILITCNMNERKCVEEAYSLLNEYGDDMYGPE 

KFTDKDQQPSGSEGEDDDAEAALKKEVGDIKAS 

TEMRLRRFQSVESGANNVVFIRTLGIEPEKLVHHI 

LQDMYKTKKKKTRVILRMLP1SGTCKAFLEDMK 

KYAETFLEPWFKAPNKGTFQIVYKSRNNSHVNR 

EEVniELAGIVCTLNSENKVDLTNPQYTVVVEIIK 

AVCCLSVVKDYMLFRKYNLQEVVKSPKDPSQLN 

SKQGNGKEAKLESADKSDQNNTAEGKNNQQVP 

ENIEELGQTKPTSNPQVVNEGGAKPELASQATE 

GSKSNENDFS 


3567 


A 


248 


3498 


GKKDSSPWTCPFHPPLQLFFVIRNTRQLGDFHLA 
KIKVRNYWTADGDLDIG AK]ST\^LYVNRNLIFNG 
KLDKGDREAPADHSILVDQKNEKSEQLEEAMNA 
HSEESKGTHEMAGASGDKELGLGCSPPAETLAD
AKLSSQGNVSGKRKNSTNCRKDSLSQLEEYLRLS 
AVPTSMGDMPSAPATSPPVKCPPVHEEPSLIQQL 
ENLMGRKICEPPGKTPSWLQPSPTGKDRKQGGR 
KPKPLWLSPEKPLAWKGRLPSDDVIGEGPGETEA 
RDKGLRHEPGWGTSRSVNTKERPQRATTKVHSD 
DSDIFNQPPNRERPASGRRGSRKDAGSSSHGDDQ 
PASREDTWSSRTPSRSRWRSEQEHTLHESWSSLS 
AFDRSHRGRISNTELPGDILDELLQQKSSRHSDLP 
PSKKGEQPGLSRGQDGYSGETDAGGDFKIPVLPY 
GQRLVIDDCSTWGDRHY VGLNGBEIFSSKGEPVQI 
SNIKADPPDINILPAYGKDPRWTNLIDGVNRTQ 
DDMHVWLAPFTRGRSHS ITEDFTHPCHVALIRJ W 
NYNKSRIHSFRGVKDITMLLDTQCIFEGEIAKASG 
TLAGAPEHFGDTILFTTDDDILEAIFYSDEMFDLD
VGSLDSLQDEEAMRRPSTADGEGDERPFTQAGL 
GADERIPELELPSSSPVPQVTTPEPGIYHGICLQLN 
FTASWGDLHYLGLTGLEVVGKEGQALPIHLHQIS
ASPRDLNELPEYSDDSRTLDKLIDGTNITMEDEH
MWLIPFSPGLDHVVTIRLDRAESIAGLRPWNYNK 
SPEDTYRGAKIVHVSLDGLCVSPPEGFLIRKGPG 
NCHFDFAQEILFVDYLRAQLLPQPARRLDMRSLE 
CASMDYEAPLMPCGFIFQFQLLTSWGDPYYIGLT 
GLELYDERGEKIPLSENNIAAFPDSVNSLEGVGG 
DVRTPDKLIDQVNDTSDGRHMWLAPILPGLVNR 
VYVIFDLPTTVSMIKLWNYAKTPHRGVKEFGLL
VDDLLVYNGILAMVSHLVGGILPTCEPTVPYHTI 
LFTEDRDIRHQEKHTTISNQAEDQDVQMMNENQ 
nTNAKRKQSVVDPALRPKTCISEKETRRRRC 


3568


A 


50 


1724 


AQGGTLSAASRFCRGGLLGPWLHPASEMAATLD 

LKSKEEKDAELDKRIEALRRKNEALIRRYQEIEE 

DRKKAELEGVAVTAPRKGRSVEKENVAVESEKN 

LGPSRRSPGTPRPPGASKGGRTPPQQGGRAGMG 

RASRSWEGSPGEQPRGGGAGGRGRRGRGRGSPH 

LSGAGDTSISDRKSKEWEERRRQNIEKMNEEME 

KIAEYERNQREGVLEPNPVRNFLDDPRRRSGPLE 

ESERDRREESRRHGRNWGGPDFERVRCGLEHER 

QGRRAGLGSAGDMTLSMTGRERSEYLRWKQER 
EKEDQERLQKHRKPTGQWKREWDAEKTDGMFK 

DGPVPAHEPSHRYDDQAWARPPKPPTFGEFLSQ 

HKAEASSRRRRKSSRPQAKAAPRAYSDHDDRWE 

TKEGAASPAPETPQPTSPETSPKETPMQPPEIPAP

AHRPPEDEGEENEGEEDEEWEDISEDEEEEEIEVE 

EGDEEEPAQDHQAPEAAPTG3PCSEQAHGVPFSP 

EEPLLEPQAPGTPSSPFSPPSGHQPVSDWGEEVEL 

NSPRTTHLAGALSPGEAWPFESV 


3569 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDIDAIFKDLSIRSVRLVRDKDTDKFKGFCYVE

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE 


3570 


A 


1


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDIDAIFKDLSIRSVRLVRDKDTDKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE


3571 


A 


28 


131


RHFFGNLCAMRAKWRKKRMRRLKRKRRKMRQ 
RSK 


3572 


A 


3 


1202 


QSEPHRKVRVDPPVRDRPPPHPPPLLVQRALPGQ 

GQAEGSDGADGAKRRAMAHQTGIHATEELKEFF 

AKARAGSVRLIKWIEDEQLVLGASQEPVGRWD 

QDYDRAVLPLLDAQQPCYLLYRLDSQNAQGFE 

WLFLAWSPDNSPVRLKMLYAATRATVKKEFGG 

GHIKDELFGTVKDDLSFAGYQKHLSSCAAPAPLT 

SAERELQQIRINEVKTEISVESKHQTLQGLAFPLQ 

PEAQRALQQLKQKMVNYIQMKLDLERETIELVH 

TEPTDVAQLPSRVPRDAARYHFFLYKHTHEGDP

LESWFIYSMPGYKCSIKERMLYSSCKSRLLDSV 

EQDFHLEIAKKIEIGDGAELTAEFLYDEVHPKQH 

AFKQAFAKPKGPGGKRGHKRLIRGPGENGDDS 


3573 


A 


49 


1869 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEV 
EEISLLQPQVEESVLNLGKFHSIVRLVAFCPFASS 
QVALENANAVSEGVVHEDLRLLLETHLPSKKKK
VLLGVGDPKIGAAIQEELGYNCQTGGVIAEILRG 
VRIiDTH^VKGLTDLSACKAQLGLGHSYSRAKV 
KFNVNRVDNMIIQSISLLDQLDKDINTFSMRVRE 
WYGYHFPELVKITNDNATYCRLAQFIGNRRELNE
DKLEKLEELTMDGAKAKAILDASRSSMGMDISAI 
DLDSflESFSSRVVSLSEYRQSLHTYLRSKMSQVAP 
SLSALIGEAVGARLIAHAGSLTNLAKYPASTVQIL 
GAEKALFRALKTRGNTPKYGLIFHSTFIGRAAAK 
NICGRISRYLANKCSIASRIDCFSEVPTSVFGEKLR 
EQVEERLSFYETGEIPRKNLDVMKEAMVQAEAE 
EAAAblTKXJLEKQEKifURL^ 
ENSSSTPEECEETSEKPKKKKKQKPQEVPQENGM 
EDPSISFSKPKKKKSFSKEELMSSDLEETAGSTSIP 
KRKKSTPKEETVMDPEEAGHRSRSKKKRKFSKEE 
PVSSGPEEAVGKSSSKKKKKFHKASQED


3574 


A 


284 


2032 


CGNERTARLWVQPVVSTMPQASEHRLGRTREPP 

VNIQPRVGSKLPFAPRARSKERROTASGPNPMLR 

PLPPRPGLPDERLKKLELGRGRTSGPRPRGPLRA 

DHGVPLPGSPPPTVALPLPSRTNLARSKSVSSGDL 

RPMGIALGGHRGTGELGAALSRLALRPEPPTLRR 

STSLRRLGGFPGPPTLFSDtTEPPASHGSFHMISAR 

SSEPFYSDDKMAHHTLLLGSGHVGLRNLGNTCF 

LNAVLQCLSSTRPLRDFCLRRDFRQEVPGGGRA 

QELTEAFADVIGALWHPDSCEAVNPTRFRAVFQ 

KYVPSFSGYSQQDAQEFLKLLMERLHLEINRRGR 

RAPPILANGPVPSPPRRGGALLEEPELSDDDRANL 

MWKRYLEREDSKIVDLFVGQLKSCLKCQACGY 

RSTTFEVFCDLSLPIPKKGFAGGKVSLRDCFNLFT 

KEEELESENAPVCDRGRQKTRSTKKLTVQRFPRI 

LVLHLNRFSASRGSIKKSSVGVDFPLQRLSLGDF 

ASDKAGSPVYQLYALCNHSGSVHYGHYTALCR 

CQTGWHVYNDSRVSPVSENQVASSEGYVLFYQL 

MQEPPRCL 


3575 


A 


1 


2408 


RELDSLADLPERIKPPYANGLSTSHLRSSSVEDVK 

LIISEGRPTIEVRRCSMPSVICEHTKQFQTISEESN 

QGSLLTVPGDTSPSPKPEVFSNVPERDLSNVSNIH 

SSFATSPTGASNSKYVSADRNLIKNTAPVNTVMD 

SPVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDF 

ICPNSNIPDQESSLQSFCNSENKVLKENADFLSLR 

QTELPGNSCAQDPASFMPPQQPCSFPSQSLSDAES 

ISKHMSLSYVANQEPGILQQKNAVQIISSALDTD 

NESTKDTENTFVLGDVQKTDAFVPVYSDSTIQEA 

SPNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAF 

SKLTYKSSSGHEVENSTTDTQVISHEKENKLESL

VLTHLSRCDSDLCEMNAGMPKGNLNEQDPKHC 

PESEKCLLSIEDEESQQSELSSLENHSQQSTQPEM 

HKYGQLVKVELEENAEDDKTENQIPQRMTRNK 

ANTMANQSKQELASCTLLSEKDSESSSPRGRIRLT 

EDDDPQIHHPRKRKVSRVPQPVQVSPSLLQAKEK 

TQQSLAAIVDSLKLDEIQPYSSERANPYFEYLHIR 

KKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLLD 

GNPLSKICIPTITPPPSLSDPLKELFRQQEVVRMKL

RLQHSIEREKLIVSNEQEVLRVHYRAARTLANQT 

LPFSACTVLLDAEVYNVPLDSQSDDSKTSVRDRF 

NARQFMSWLQDVDDKFDKLKTCLLMRQQHEA 

AALNAVQRLEWQLKLQELDPATYKSISIYEIQEF 

YVPLVDVNDDFELTPI 


3576 


A 


5 


1421 


LRLAWHDGARWPLGTPRAAATRREAAALPPVT
LALLCLDGVFLSSAENDFVHRIQEELDRFLLQKQ 

LSKVLLFPPLSSRLRYLIHRTAENFDLLSSFSVGE 

GWKR'RTVTPHO'nTRVPCi^nnT ^rtPPT? APAQPPQP 
vj wjviNAi v iv^nv^uiix v x ooi-'vJl^ovJr v^Ivftx Aov^roK. 

YHGPRPISNQGAAAVPRGARAGRWYRGRKPDQ 
PLYVPRVLRRQEEWGLTSTSVLKREAPAGRDPEE 
PGDVGAGDPNSDQGLPVLMTQGTEDLKGPGQR 
CENEPLLDPVGPEPLGPESQSGKGDMVEMATRF 
GSTLQLDLEKGKESLLEKRLVAEEEEDEEEVEED 
GPSSCSEDDYSELLQEITDNLTKKEIQIEKIHLDTS 
SFMEELPGEKDLAHVVEIYDFEPALKTEDLLATF 
SEFQEKGFRIQWVDDTHALGIFPCRASAAEALTR
EFSVLKIRPLTQGTKQSKLKALQRPKLLRLVKER 
PQTNATVARRLVARALGLQHKKKERPAVRGPLP 
P 


3577 


A 


102 


1998 


DTRTPGSLEMGPLQFRDVAIEFSLEEWHCLDTAQ 

RNLYRKVMLENYSNLVFLGIWSKPDLIAHLEQG 

KKPLTMKRHEMVANPSGPVICSHFAQDLWPEQN 

IKDSFQKVILRRYEKRGHGNLQLIKRCESVDECK 

VHTGGYNGLNQCSTTTQSKVFQCDKYGKVFHK 

FSNSNRHNIRHTEKKPFKCIECGKAFKQFSTLITH

KXfflTGEKPYICEECGKAFKYSSALNTHKRIHTG 

EKPYKCDKCDKAFIASSTLSKHEIIHTGKKPYKCE 

ECGKAFNQSSTLTKHKKEHTGEKPYKCEECGKAF 

NQSSTLTKHKKIHTGEKPYVCEECGKAFKYSRIL

TTHKRIHTGEKPYKCNKCGKAFIASSTLSRHEFIH 

MGKKHYKCEECGKAFIWSSVLTRHKRVHTGEKP 

YKCEECGKAFKYSiSTLSSHKRSHTGEKPYKCEEC 

GKAFVASSTLSKHEIIHTGKKPYKCEECGKAFNQ 

SSSLTKHKKIHTGEKPYKCEECGKAFNQSSSLTK 

HKKIHTGEKPYKCEECGKAFNQSSTLIKHKKIHT

REKPYKCEECGKAFHLSTHLTTHKILHTGEKPYR 

CRECGKAFTsTHSATLSSHKKIHSGEKPYECDKCG 

KAFISPSSLSRHEHHTGEKP 


3578 


A


1725 


445


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTKNMQRYFGTNSVICSKKDKQSVRTEETS 

KETSESQDSEKENTKKDLLGIIKGMKVELSTVNV 

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH

EEESRAQRDAKRPKISFSNIISDMKVARSATARV

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT 

GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 

LATVNEQPLQNGFEELIQWTKEGKLWEFPINNEA

GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFMELV 

TCGLSKNPYLSVXQK\HEHIEWFRNYFNEKKDILK 

ESNIQFKLRPWKFLFRNN


3579 


A 


1725


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA

LLARTKNMQRYFGTNSVICSKKDKQSVRTEETS

KETSESQDSEKENTKKDLLGIIKGMKVELSTVNV

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH

EEESRAQRDAKRPKISFSNIISDMKVARSATARV 

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT

GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ

LATVNEQPLQNGFEELIQWTKEGKLWEFPINNEA 

GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFMELV 

TCGLSKNPYLSVXQK\HEHIEWFRNYFNEKKDILK

ESNIQFKLRPWKFLFRNN 


3580 


A 


3673 


1619 


LYCVAPYSRHLLG11MSHLPMKLLRKKIEKRNLK 
LRQRNLKFQGASNLTLSETQNGDVSEETMGSRK 
VKKSKQKPMNVGLSETQNGGMSQEAVGNDCVT 
KSPQKSTVLTNGEAAMQSSNSESKKKKKKKRK 
MV>ODAEPDTKKAKTENKGKSEEESAETTKETEN
NVEKPDNDEDESEVPSLPLGLTGAFEDTSFASLC 
NLVNENTLKAIKEMGFTNMTEIQHKSIRPLLEGR 
DLLAAAKTGSGKTLAFLIPAVELIVKLRFMPRNG 
TGVLILSPTRELAMQTFGVLKELMTHHVHTYGLI 
MGGSNRSAEAQKLGNGINIIVATPGRLLDHMQN 
TPGFMYKNLQCLVIDEADRILDVGFEEELKQIIKL 
LPTRRQTMLFSATQTRKVEDLARISLKKEPLYVG 
VDDDKANATVDGLEQGYWCPSEKRFLLLFTFL 
KKNRKKKLMVFFSSCMSVKYHYELLNYIDLPVL 
AHGKQKQNKRTTTFFQFCNADSGTLLCTDVAA 
RGLDIPEVDWIVQYDPPDDPKEYIHRVGRTARGL
NGRGHALLILRPEELGFLRYLKQSKVPLSEFDFS
WSKISDIQSQLEKLIEKNYFLHKSAQEAYKSYIRA
YDSHSLKQIFNVNNLNLPQVALSFGFKVPPFVDL 
NVNSNEGKQKKRGGGGGFGYQKTKKVEKSKIF 
KHISKKSSDSRQFSH 




3581


A


23 


453 


LCRCICIKNITPHCLWDKVLSQFTYILDNLSNFMS 

IfflPHSLRNSCLIRMDLLYWQFTTYTITFCFSHLSG 

RLTLSAQfflSHRPCLLSYSLLFWKVHHLFLEGFPC 

SPRLDEMSFHQFPQHPVHVSWHLPIVYKGSMT 

QVSPH 


3582 




3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEIKIPPEPPGRC

SNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRN

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK 

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVIS 

AVGTIVKKAKQ 


3583 


A 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEIKIPPEPPGRC

SNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRN

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI

PVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVIS

AVGTIVKKAKQ 


3584 


A 


3


1139 


PGSTISSRADRLGAPVLAHPKMAERQEEQRGSPP 

LRAEG'KADAEVKLILYHWTHSFSSQKVRLVIAE 

KALKCEEHDVSLPLSEHNEPWFMRLNSTGEVPV 

LIHGENIICEATQIIDYLEQTFLDERTPRLMPDKES 

MYYPRVQHYRELLDSLPMDAYTHGCILHPELTV 

DSMIPAYATTRIRSQIGNTESELKKLAEENPDLQE 

AYIAKQKRLKSKLLDHDNVKYLKKILDELEKVL

LAVTLHRLKFLGFARRNWGNGKRPNLETYYERV 
LKRKTFmVLGHVNNILISAVLPTAFRVAKXRAP 
KVLGTTLWGLLAGVGYFAFMLFRKRLGSMILA
LRPRPNYF 


3585 


A 


1 


1777 


RRHSPGSPAFAPSSRATAICPRAARAPATLLLALG 

AVLWPAAGAWELTILHTNDVHSRLEQTSEDSSK 

CVNASRCMGGVARLFTKVQQIRRAEPNVLLLDA 

GDQYQGTIWFTVYKGAEVAHFMNALRYDAMA 

LGNHEFDNGVEGLIEPLLKEAKFPILSANIKAKGP 

LASQISGLYLPYKVLPVGDEWGIVGYTSKETPF 

LSNPGTNLVFEDEITALQPEVDKLKTLNVNiaiAL 

GHSGFEMDKLIAQKVRGVDWVGGHSNTFLYT

GNPPSKEVPAGKYPFIVTSDDGRKVPVVQAYAF 

GKYLGYLKIEFDERGNVISSHGNPILLNSSIPEDPS 

IKADINKWRIKLDNYSTQELGKTIVYLDGSSQSC 

RFRECNMGNLICDAMI>MNLRHTDEMFWNHVS 

MCH.NGGGIRSPIDERNNGTITWENLAAVLPFGG 

TFDLVQLKGSTLKKAFEHSVHRYGQSTGEFLQV 

GGIHVVYDLSRKPGDRVVKLDVLCTKCRVPSYD 

PLKMDEVYKVILPNFIANGGDGFQMIKDELLRH 

DSGDQDINVVSTYISKMKVIYPAVEGRIKFSTGS 

HCHGSFSLIFLSLWAVIFVLYQ 


3586 


A 


1399 


881 


LSNKDVLSPQLKDENSKLRRKLNEVQSFSEAQTE 

MVRTLERKLEAKMIKEESDYHDLESVVQQVEQN 

LELMTKRAVKAENHVVKLKQEISLLQAQVSNFQ 

RENEALRCGQGASLTWKQNADVALQNLRVVM 

NSAQASIEQLVSGAETLNLVAEILKSIDRISEVKD 

EEEDS 


3587


A 


88 


1639 


GCVGRGLPLPPRHPTPPSSSSSPFVLLAFLLLVRL 

DPAVSGKMAAPRPPPARLSGVMVPAPIQDLEAL 

RALTALFKEQRNRETAPRTIFQRVLDILKKSSHA 

VELACRDPSQVENLASSLQLITECFRCLRNACIEC 

SVNQNSIKNLDTIGVAVDLILLFRELRVEQESLLT 

AFRCGLQFLGNIASRNEDSQSIVWVHAFPELFLS

CLNHPDKKIVAYSSMILFTSLNHERMKELEENLN 

IAEDVIDAYQKHPESEWPFLIITDLFLKSPELVQA 

MFPKLNNQERVTLLDLMIAKITSDEPLTKDDIPVF 

LRHAELIASTFVDQCKTVLKLASEEPPDDEEALA 

TIRLLDVLCEMTVNTELLGYLQVFPGLLERVIDL 

LRVIHVAGKETTNIFSNCGCVRAEGDISNVANGF 

KSHLIRLIGNLCYKNKDNQDKVNELDGIPLILDN 

CMSDSWLTQWVIYAKNfLIEDNSQNQDLIAK 

MEEQGLADASLLKKVGFEVEKKGEKLILKSTRD 

TPKP


3588 


A 


3


1462 


DSPRNRFEILGRPTRTPTRPGPRPAMEDLDALLSD 

LETTTSHMPRSGAPKERPAEPLTPPPSYGHQPQT 

GSGESSGASGDKDHLYSTVCKPRSPKPAAPAAPP 

FSSSSGVLGTGLCELDRLLQELNATQFNITDEIMS 

QFPSSKVASGEQKEDQSEDKKRPSLPSSPSPGLPK 

ASATSATLELDRLMASLSDFRVQNHLPASGPTQP 

PVYSSTNEGSPSPPEPTGKGSLDTMLGLLQSDLSR 

RGVPTQAKGLCGSCNKPIAGQWTALGRAWHPE 

HFVCGGCSTALGGSSFFEKDGAPFCPECYFERFSP 

RCGFCNQPIRHKMVTALGTHWHPEHFCCVSCGE 

ILDNYISALSALWHPDCFVCRECFAPFSGGSFFEH 
EGRPLCENHFHARRGSLCATCGLPVTGRCVSAL 
GRRFHPDHFTCTFCLRPLTKGSFQERAGKPYCQP 
CFLKLFG 


3589 



226 



6793 



SPPKKSRKCNLSFRLISAERWRFFLLILMEMPRKP 

RLTLFVQRRIENIATEREFDPEEFYYLLEAAEGHA

KEGQGIKTDIPRYIISQLGLNKDPLEEMAHLGNy 

DSGTAETPETDESVSSSNASLKLRRKPRESDFETI 

KLISNGAYGAWFVRHK^SRQIU^AMKIGNKQN^ 

ILRNQIQQAFVERDILTFAENPFWSMYCSFETRR 

HLCMVMEYVEGGDCATLMKNMGPLPVDMLARM 

YFAETVLALEYLHNYGIVHRDLKPDNLLVTSMG 

HIKLTDFGLSKVGLMSMTTNLYEGHIEKDAREFL

DKQVCGTPEYIAPEVILRQGYGKPVDWWAMGII

LYEFLVGCVPFFGDTPEELFGQVISDEINWPEKDE 

APPPDAQDLITLLLRQNPLERLGTGGAYEVKQHR 

FFRSLDWNSLLRQKAEFIPQLESEDDTSYFDTRSE 

KYHHMETEEEDDTNDEDFNVEIRQFSSCSHRFSK 

VFSSIDRITQNSAEEKEDSVDKTKSTTLPSTETLS 

WSSEYSEMQQLSTSNSSDIESNRHKLSSGLLPKL 

AISTEGEQDEAASCPGDPHEEPGKPALPPEECAQ 

EEPEVTTPASTISSSTLSVGSFSEHLDQINGRSECV 

DSTDNSSKPSSEPASHMARQRLESTEKKKISGKV 

TKSLSASALSLMIPGDMFAVSPLGSPMSPHSLSSD 

PSSSRDSSPSRDSSAASASPHQPIVmSSGKNYGFT 

IRAIRVYVGDSDIYTVHHIVWNVEEGSPACQAGL 

KAGDLITHINGEPVHGLVHTEVIELLLKSGNKVSI 

TTTPFENTSnCTGPAIlRNSYKSRMVRRSKKSKkK 

ESLERRRSLFKKLAKQPSPLLHTSRSFSCLNRSLS 

SGESLPGSPTHSLSPRSPTPSYRSTPDFPSGTOSSQ 

SSSPSSSAPNSPAGSGHIRPSTLHGLAPKLGGQRY 

RSGRRKSAGNIPLSPLARTPSPTPQPTSPQRSPSPL

LGHSLGNSKIAQAFPSKMHSPPTrVRHIVRPKSAE 

PPRSPLLKRVQSEEKLSPSYGSDKKHLCSRKHSL 

EVTQEEVQREQSQREAPLQSLDENVCDVPPLSRA 

RPVEQGCLKRPVSRKVGRQESVDDLDRDKLKAK 

VVVKKADGFPEKQESHQKFHGPGSDLENFALFK 

LEEREKKVYPICAVERSSTFENKASMQEAPPLGSL 

LKDALHKQASVRASEGAMSDGPVPAEHRQGGG 

DFRRAPAPGTLQDGLCHSLDRGISGKGEGTEKSS 

QAKELLRCEKLDSKLAMDYLRKKMSLEDKEDN 

LCPVLKPKMTAGSHECLPGNPVRPTGGQQEPPPA 

SESRAFVSSTHAAQMSAVSFVPLKALTGRVDSGT 

EKPGLVAPESPVRKSPSEYKLEGRSVSCLEPIEGT 

LDIALLSGPQASKTELPSPESAQSPSPSGDVRASV 

PPVLPSSSGKKNDTTSARELSPSSLKMNKSYLLEP 

WFLPPSRGLQNSPAVSLPDPEFKRDRKGPHPTAR 

SPGTVMESNPQQREGSSPKHQDHTTDPKLLTCLG 

QNLHSPDLARPRCPLPPEASPSREKPGLRESSERG 

PPTARSERSAARADTCREPSMELCFPETAKTSDN 

SKNLLSVGRTHPDFYTQTQAMEKAWAPGGKTN 

HKDGPGEARPPPRDNSSLHSAGEPCEKELGKVRR 

GVEPKPEALLARRSLQPPGIESEKSEKLSSFPSLQ 

KDGAKEPERKEQPLQRHPSSIPPPPLTAKDLSSPA 

ARQHCSSPSHASGREPGAKPSTAEPSSSPQDPPKP 

VAAHSESSSHKPRPGPDPGPPKTKHPDRSLSSQK 

PSVGATKGKEPATQSLGGSSREGKGHSKSGPDVF 

PATPGSQNKASDGIGQOEGGPSVPLHTDRAPLDA 

KPQPTSGGRPLEVLEKPVHLPRPGHPGPSEPADQ 
KLSAVGEKQTLSPKHPKPSTVKDCPTLCKQTDN 

RQTDKSPSQPAANTDRRAEGKKCTEALYAPAEG 

DKLEAGLSFVHSENRLKGAERPAAGVGKGFPEA 

RGKGPGPQKPPTEADKPNGMKRSPSATGQSSFRS 

TALPEKSLSCSSSFPETRAGVREASAASSDTSSAK 

AAGGMLELPAPSNRDHRKAQPAGEGRTHMTKS 

DSLPSFRVSTLPLESHHPDPNTMGGASHRDRALS 

VTATVGETKGKDPAPAQPPPARKQNVGRDVTKP 

SPAPNTDRPISLSNEKDFVVRQRRGPCESLRSSPHK 

KAL 


3590 


A 


3


935


RATTRPKNEVQDYVSVEYLSPHMGGTDPFKYSY 

PPLVDDDFQTPLCENGPITSEDETSSKEDIESDGK 

ETLETISNEEQTPLLKKINPTESTSKAEENEKVDS 

KVKAFKKPLSVFKGPLLHISPAEELYFGSTESGEK 

KTLIVLTNVTKNrVAFKVRTTAPEKYRVKPSNSS 

CDPGASVDIVVSPHGGLTVSAQDRFLIMAAEME 

QSSGTGPAELTQFWKEWRNKVMErmLRCHTVE 

SSKPNTLTLKDNAFNMSDKTSEDICLQLSRLLES 

NRKLEDQVQRCIWFQQLLLSLTMLLLAFVTSFFY 

LLYS 


3591


A 


303 


2 


GGSWGPLCPVSPAMSLSDPGLGYHPTCWTLRWP 

PLCSLHALHVFHCLFSSRLGTPVSPRLAMDPNCS 

CEAGGSCACAGSCKCKKCKCTSCKKSCCSCCPL 


3592 


A 


1052 


1779 


GKTMMRKMLLAAALSVTAMTAHADYQCSVTP 

RDDVIVSPQTVQVKGENGNLVITPDGNVMYNGK 

QYSLNAAQREQAKDYQAELRSTLPWIDEGAKSR 

A^KARIALDKIIVQEMGESSKMRSRLTKLDAQVK 

EQMNRIIETRSDGLTFHYKAIDQVRAEGQQLVNQ

AMGGILQDSINEMGAKAVLKSGGNPLQNVLGSL

GGLQSSIQTEWKKQEKDFQQFGKDVCSRVVTLE

DSRKALVGNLK 


3593 


A 


3 


1837 


LSFEKVDIQTDNDLTKEMYEGKENVSFELQRDFS

QETDFSEASLLEKQQEVHSAGNIKKEKSNTIDGT 

VKDETSPVEECFFSQSSNSYQCHTITGEQPSGCTG 

LGKSISFDTICLVKHEIINSEERPFKCEELVEPFRCD 

SQLIQHQENNTEEKPYQCSECGKAFSINEKLIWH 

QRLHSGEKPFKCVECGKSFSYSSHYITHQTIHSGE 

KPYQCKMCGKAFSYNGSLSRHQRIHTGEKPYQC

KECGNGFSCSSAYTTHQRVHTGEKPYECNDCGK 

AFNGNAKLIQHQRIHTGEKPYECNECGKGFRCSS 

QLRQHQSIHTGEKPYQCKEGGKGFNNNTKLIQH 

QRIHTASLAEQLFKASGNHPNWGCCLTISSPGPS 

VYGPKMNMRGAPNSRLAGGREKRTQDTDFGQC 

SFLPSHSPSCFEPWNVTDYDSSWYRQKQVLSGV 

WSSPLSILKLPRTLIRISIfflQEMDTPGEMLMTGR 

GSLGPTLTTEAPAAAQPGKQGPPGTGRCLQAPGT 

EPGEQTPEGARELSPLQESSSPGGVKAEEEQRAG 

AEPGTRPSLARSDDNDHEVGALGLQQGKSPGAG 

NPEPEQDCAARAPVRAEAVRRMPPGAEAGSVVL 

DD 


3594 




39 


**\) i 


p A a A/TTV/iTvrQp vrvprvT a rv/Tvvr r^p Tn Qr)nr\r^T/~i 
tsj\jASVuvLLJ i oXv v \£r irJLi/vl V iiv V lAxix 1 0 ov^Ov^L- i 1^ 

VRVEFMDDTSRSIIRSVKGPVREGDVLTLLESERE 
ARRLR 


3595 


A 


973 


68 


GRVGTKHQMADDAGAAGGPGGPGGPGMGNRG 
GFRGGFGSGIRGRGRGRGRGRGRGRGARGGKAE 
DKEWMPVTKLGRLVKDMKIKSLEEIYLFSLPIKE 

SEnDFFLGASLKDEVLKIMPVQKQTRAGQRTRF 

KAFVAIGDYNGHVGLGVKCSKEVATAIRGAIILA 

KLSIVPVRRGYWGNKIGKPHTVPCKVTGRCGSV 

LVRLIPAPRGTGIVSAPVPKKLLMMAGIDDCYTS 

ARGCTATLGNFAKATFDAISKTYSYLTPDLWKE 

TVFTKSPYQEFTDHLVKTHTRVSVQRTQAPAVA 

TT 


3596 


A 


106 


2960 


DERRVGAADMFGRSRSWVGGGHGKTSRNIHSL

DHLKYLYHVLTXNTTVTEQNRNLLVETIRSITEIL 

IWGDQNDSSVFDFFLEKNMFVFFLNILRQKSGRY 

VCVQLLQTLNILFENISHETSLYYLLSNNYVNSII 

VHKFDFSDEEIMAYYISFLKTLSLI<JLNNHTVPiFF 

YNEHTNDFALYTEAIKFFNHPESMVRIAVRTITL 

NVYKVSLDNQAMLHYIRDKTAVPYFSNLVWFIG 

SHVffiLDDCVQTDEEHRNRGKLSDLVAEHLDHL 

HYLNDILIINCEFLNDVLTDHLLNRLFLPLYVYSL 

ENQDKGGERPKISLPVSLYLLSQVFLIIHHAPLVN 

SLAEVILNGDLSEMYAKTEQDIQRSSAKPSIRCFI 

KPTETLERSLEMNKHKGKRRVQKRPNYKNVGEE 

EDEEKGFIEDAQEDAEKAKGTEGGSKGIKTSGES 

EEIEMVnviERSKLSELAASTSVQEQNTTDEEKSA 

AATCSESTQWSRPFLDMVYHALD$PDDDYHALF 

VLCLLYAMSHNKGMDPEKLERIQLPVPNAAEKT 

TYNHPLAERLIRIMNNAAQPDGKIRLATLELSCL 

LLKQQVLMSAGCIMKDVHLACLEGAREESVHLV 

RHFYKGEDIFLDMFEDEYRSMTMKPMNVEYLM 

MDASILLPPTGTPLTGIDFVKRLPCGDVEKTRRAI

RVFFMLRSLSLQLRGEPETQLPLTREEDLDCTDDV 

LDL>TNSDLIACTVITKDGGMVQRSLAVDIYQMS 

LVEPDVSRLGWGVVKFAGLLQDMQVTGVEDDS 

RALNITIHKPASSPHSKPFPILQATFIFSDHIRCIIAK

QRLAKGRIQARRMKMQRIAALLDLPIQPTTEVLG

FGLGSSTSTQHLPFRFYDQGRRGSSDPTVQRSVF 

ASVDKVPGFAVAQCINEHSSPSLSSQSPPSASGSP 

SGSGSTSHCPSGGTSSSSTPSTAQSPAGIGHVTQ 


3597 


A 


427 


277


GVRRIQHHWAQMHECNVHTYASLFCLFLLHTG 
KLCCLNSHRHFHCIKYSK 


3598 


A 


1


503 


FRPRTKKATAMYLEHYLDSIENLPCELQRNFQL 

MRELDQRTEDKKAEIDILAAEYISTVKTLSPDQR 

VERLQKIQNAYSKCBCEYSDDKVQLAMQTYEMV

DKHIRRLDADLARFEADLKDKMEGSDFESSGGR

GLKKGRGQKEKRGSRGRGRRTSEEDTPICKKKH 

KGG 


3599 


A 


2 


3907 


KTITALAFSPDGKYLVTGESGHMPAVRVWDVAE 

HSQVAELQEHKYGVACVAFSPSAKYIVSVGYQH 

DMIVNVWAWKKNIVVASNKVSSRVTAVSFSED 

CSYFVTAGNRHIKFWYLDDSKTSKVNATVPLLG 

RSGLLGELRNNLFTDVACGRGICKADSTFCITSSG 

LLCEFSDRRLLDKWVELRV^TEVKDSNQACLPP 

HQIYVDGNTQALLDTELPGGDKADASLLDPRVGI 
RSVCVSPNGQHLASGDRMGTLRVHELQSLSEML 
KVEAHDSEILCLEYSKPDTGLKLLASASRDRLIH 
VLDAGREYSLQQTLDEHSSSITAVKFAASDGQVR 
MISCGADKSIYFRTAQKSGDGVQFTRTHHVVRK 

TTLYDMDVEPSWKYTAIGCQDRNIRIFNISSGKQ 

KKLFKGSQGEDGTLIKVQTDPSGIYIATSCSDKNL 

SIFDFSSGECVATMFGHSEIVTGMKFSNDCKHLIS 

VSGDSCIFVWRLSSEMTISMRQRLAELRQRQRGG 

KQQGPSSPQRASGPNRHQAPSMLSPGPALSSDSD 

KEGEDEGTEEELPALPVLAKSTKKALASVPSPAL 

PRSLSHWEMSRAQESVGFLDPAPAANPGPRRRG

RWVQPGVELSVRSMLDLRQLETLAPSLQDPSQD 

SLAIIPSGPRKHGQEALETSLTSQNEKPPRPQASQ 

PCSYPHIIRLLSQEEGVFAQDLEPAPIEDGIVYPEP 

SDOTTMDTSEFQVQAPARGTLGRVYPGSRSSEK 

HSPDSACSVDYSSSCLSSPEHPTEDSESTEPLSVD 

GISSDLEEPAEGDEEEEEEEGGMGPYGLQEGSPQ 

TPDQEQFLKQHFETLASGAAPGAPVQVPERSESR 

SISSRFLLQVQTRPLREPSPSSSSLALMSRPAQVPQ 

ASGEQPRGNGANPPGAPPEVEPSSGNPSPQQAAS 

VLLPRCRLNPDSSWAPKRVATASPFSGLQKAQS 

VHSLVPQERHEASLQAPSPGALLSREIEAQDGLG 

SLPPADGRPSRPHSYQNPTTSSMAKISRSISVGEN

LGLVAEPQAHAPIRVSPLSKLALPSRAHLVLDIPK 

PLPDRPTLAAFSPVTKGRAPGEAEKPGFPVGLGK 

AHSTTERWACLGEGTTPKPRTECQAHPGPSSPCA 

KlKlLr V boLr QCjrbN LQPrrrbKTPN rMECTKPGA 

ALSQDSEPAVSLEQCEQLVAELRGSVRQAVRLY 

HSVAGCKMPSAEQSRIAQLLRDTFSSVRQELEAV 

AGAVLSSPGSSPGAVGAEQTQALLEQYSELLLRA 

VERRMERKL


3600 


A 


1688 


916 


IPGSTISCSMALCEAAGCGSALLWPRLLLFGDSIT
QFSFQQGGWGASLADRLVRKCDVLNRGFSGYN
TRWAK3ILPRLIRKGNSLDIPVAVTIFFGANDSAL 
jWhNr&Klril r Lbh Y AANJLKSM V Q YLKS VDIPENR 

VILITPTPLCETAWEEQCIIQGCKLNRLNSVVGEY 

AM APT r\\f & fWtf^nTTwn x\t \\fn \/(r\ i r\Qr\v\x?<^Q\rT 
AJNA^-Lrl^ V Av£LK-Aj 1 D V L>L)L> W 1 LM^Dbv^Dr bo Y L 

SDGLHLSPKGNEFLFSHLWPLIEKKVSSLPLLLPY 
WRDVAEAKPELSLLGDGDH 


3601 


A 


44 


223 


VHFPLIPQLAKCFWTMNRAARMCSEKRYYSEFL 
QIAHLFNYGLSSFLREFHFLIKLLQ


3602 


A 


37


1124 


VPKPASGKRRLEFRPQDSKACAATPHSPGRITSR 

TRGSQKVRSVPPRLPWAQASASTDWEGLRGVPG 

PALRRENFLEAAASGRSGRTPTGGVGFRDVGGP 

HFPIFPAAHFLWCNLHTPRRPACNAPWHSPVGE1 

SPPPPJESQLRRDPEVHFESPAHPLGFRLLPGRGLP 

ANAVTVETAAMAAPRQIPSHIVRLKPSCSTDSSF

oLAoKiiLr V SS WQ VI EFSSKNLWEQI 
CKEYEAEQPPFPEGYKVKQEPVITVAPVEEMLFH 
GFSAEHYFPVSHFTMISRTPCPQDKSETINPKTCS 

QMTPSGGKACVWGHLPSSSHTI 


3603


A 


286 


587 


NISNKAEVSSHPSVISHSMDSFGQPRPEDNQSVLR

RMOKKYWKTKOVFIKATGKKEDEHLVASDAEL 

DAKLEVFHSVQETCTELLKIIEKYQLRLNGMKS 


3604 


A 


103 


2440 


QPRRRVFPAAGRGPGRKCSQWGRQASVSFEDVT

VDFSKEEWQHLDPAQRRLYWDVTLENYSHLLS 

VGYQPKSEAAFKLEQGEGPWMLEGEAPHQSCS 
GEAIGKMQQQGIPGGIFFHCERFDQPIGEDSLCSI 

LEELWQDNDQLEQRQENQNNLLSHVKVLIKERG 

YEHKNTEKIIHVTTKLWSIKRLFINCDTILKH 

SHNHNRNSATKNLGKIFGNGNNFPHSPSSTKNEN 

AKTGANSCEHDHYEKHLSHKQAPraHQKIHPEE 

KLYVCTECVMGFTQKSHLFEHQRIHAGEKSREC 

DKSNKVFPQKPQVDVHPSVYTGEKPYLCTQCGK 

VFTLKSNLITHQKIHTGQKPYKCSECGKAFFQRS 

DLFRHLRIHTGEKPYECSECGKGFSQNSDLSIHQ 

KTHTGEKHYECNECGKAFTRKSALRMHQRIHTG 

EKPYVCADCGKAFIQKSHFNTHQRIHTGEKPYEC

SDCGKSFTKKSQLHVHQRIHTGEKPYICTECGKV

FTHRTNLTTHQKTHTGEKPYMCAECGKAFTDQS 

NLIKHQKTHTGEKPYKCNGCGKAFIWKSRLKIH

QKSHIGERHYECKDCGKAFIQKSTLSVHQRIHTG 

EKPYVCPECGKAHQKSHFIAHHRIHTGEKPYECS 

DLOKCrTJ^bQl^KVHQKlH 1 (jrbKPNICAECGKAr 

TORSNLITHQKIHTREKPYECGDCGKTFTWKSRL 

NIHQKSHTGERHYECSKCGKAFIQKATLSMHQII 

HTGKKPYACTECQKAFTDRSNLIKHQKMHSGEK


3605 


A 


3


322


SFRMSGRGKGGKGLGKGGAKRHRKVLRDNIQGI 
TKPAIRRLARRGGVKRISGLIYEETRGVLKVFLEN
VIRDAVTYTEHAKRKTVTAMDWYALKRQGRT 
LYGFGG 


3606 


A 


1 


1749 


VPVTAEAKLMGFTQGCVTFEDVAIYFSQEEWGL

LDEAQRLLYRDVMLENFALITALVCWHGMEDE 

ETPEQSVSVEGVPQVRTPEASPSTQKIQSCDMCV 

PFLTDILHLTDLPGQELYLTGACAVFHQDQKHHS 

AEKPLESDMDKASFVQCCLFHESGMPFTSSEVG 

KDFLAPLGILQPQAIANYEKPNKISKCEEAFHVGI 

SHYKWSQCRRESSHKHTFFHPRVCTGKRLYESS 

KCGKACCCECSLVQLQRVHPGERPYECSECGKS 

FSQTSHLNDHRRIHTGERPYVCGQCGKSFSQRAT 

LIKHHRVHTGERPYECGECGKSFSQSSNLIEHCRI 

HTGERPYECDECGKAFGSKSTLVRHQRTHTGEK 

PYECGECGKLFRQSFSLVVHQRIHTTARPYECGQ 

CGKSFSLKCGLIQHQLIHSGARPFECDECGKSFSQ 

K 1 i LNKHrlKVH 1 AbKPY VCOtCuKAFMr KSKJL 

VRHQRTHTGERPFECSECGKFFRQSYTLVEHQKI 

HTGLRPYDCGQCGKSFIQKSSLIQHQ WHTGERP 

GKPYSPRSNTV 


3607


A 




OJ 1 


AMAUr *jrr vjr uJJr JJtiV^ Y Ur Lr JSJ_# V JL» V kjD Ao V vjJs. 1 

CWQRI^TGAFSERQGSTIGVDFTMKTLEIQGKR 
VKLQIWDTAGQER 


3608 


A 


545 


379 


AIKGYIHLSAPRNRYMHTTASNGRMLFMKVTM 
YMRRGVQIMGWSVRMAFMACFTQ 


3609 


A 


118 


873 


VWMAWQVSLLELEDRLQCPICLEVFKESLMLQC 
GHSYCKGCLVSLSYHLDTKVRCPMCWQWDGS 

^enr PKFV^T AWVTPAT PT PnnPPPKVr r OTTPTP>JPT 

LFCEKDQELICGLCGLLGSHQHHPVTPVSTVCSR 
MKEELAALFSELKQEQKKVDELIAKLVKNRTRIV 
NESDWSWVIRREFQELRHPVDEEKARCLEGIGG 
HTRGLVASLDMQLEQAQGTRERLAQAECVLEQF 
GNEDHHEFIWKFHSMASR 


3610 


A 


2 


987 


DPRVRPPLLQPPPPLLPRLVILKMAPLDLDKYVEI 

ARLCKYLPENDLKRLCDYVCDLLLEESNVQPVS 

TPVTVCGDHGQFYDLCELFRTGGQVPDTNYIFM 

GDFVDRGYYSLETFTYLLALKAKWPDRITLLRG 

NHESRQITQVYGFYDECQTKYGNANAWRYCTK 

VrUMJL I VAAJLiJjbCjlLUVHOOJLSPDiKTLDQIRTI 

ERNQEIPHKGAFCDLVWSDPEDVDTWAISPRGA \ 

GWLFGAKVTNEFVHINNLKLICRAHQLVHEGYK 

FMFDEKLVTVWSAPNYCYRCGNMSIMVFKDVN 

TREPKLFRAVPDSERVIPPRTTTPYFL 


3611 


A 


2459


869 


AEKMTAELREAMALAPWGPVKVKKEEEEEENF 

PGQASSQQVHSENIKVWAPVQGLQTGLDGSEEE 

EKGQNISWDMAWLKATQEAPAASTLGSYSLPG 

TLAKSEILETHGTMNFLGAETKNLQLLVPKTEIC 

EEAEKPLIISERIQKADPQGPELGEACEKGNMLK 

RQRIKREKKDFRQVIVNDCHLPESFKEEENQKCK 

KSGGKYSLNSGAVKNPKTQLGQKPFTCSVCGKG 

FSQSANLVVHQRIHTGEKPFECHECGKAFIQSAN

LVVHQRIHTGQKPYVCSKCGKAFTQSSNLTVHQ 

KIHSLEKTFKCNECEKAFSYSSQLARHQKVHITE 

KCYECNECGKTFTRSSNLIVHQRIHTGEKPFACN

DCGKAFTQSANLIVHQRSHTGEKPYECKECGKA 

FSCFSHLIVHQRIHTAEKPYDCSECGKAFSQLSCL 

IVHQRIHSGDLPYVCNECGKAFTCSSYLLIHQRIH 

NGEKPYTCNECGKAFRQRSSLTVHQRTHTGEKP 

YECEKCGAAFISNSHLMRHHRTHLVE 


3612 


A 


318 


2245 


SPMAEAALWTPQIPMVTEEFVKPSQGHVTFEDI

AVYFSQEEWGLLDEAQRCLYHDVMLENFSLMA 

SVGCLHGIEAEEAPSEQTLSAQGVSQARTPKLGP 

SIPNAHSCEMCILVMBGDILYLSEHQGTLPWQKPY 

TSVASGKWFSFGSNLQQHQNQDSGEKHIRKEESS 

ALLLNSCKIPLSDNLFPCKDVEKX)FPTILGLLQHQ 

TTHSRQEYAHRSRETFQQRRYKCEQVFNEKVHV 

TEHQRVHTGEKAYKRREYGKSLNSKYLFVEHQR 

THNAEKPYVCNICGKSFLHKQTLVGHQQRIHTRE 

RSYVCIECGKSLSSKYSLVEHQRTHNGEKPYVCN 

VCGKSFRHKQTFVGHQQRIHTGERPYVCMECGK 

SFIHSYDRIRHQRVHTGEGAYQCSECGKSFIYKQ

SLLDHHRIHTGERPYECKECGKAFIHKKRLLEHQ 

RIHTGEKPYVCnCGKSFIRSSDYMRHQRIHTGER 

AYECSDCGKAFISKQTLLKHHKIHTRERPYECSE 

CuJvor Y Lb V JU^LQHv^RIHTREQLCECNECGKVF 

SHQKRLLEHQKVHTGEKPCECSECGKCFRHRTS 

LIQHQKVHSGERPYNCTACEKAFIYKNKLVEHQ 

RIHTGEKPYECGKCGKAFNKRYSLVRHQKVHIT 

EEP 


3613 


A 


817 


3345 


NQSHPDSETVTVEGGRRKMKSNQERSNECLPPK 
KREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGN 
PGGRGHGGGRHGPAGTSVELGLQQGIGLHKALS 

VQYAHLPHTFQFIGSSQYSGTYASFIPSQLIPPTAN 
PVTSAVASAAGATTPSQRSQLEAYSTLLANMGS
LSQTPGHKAEQQQQQQQQQQQQQQQQQQQQQ 
QQQHQQQQQQQQQQQQQQHLSRAPGLITPGSPP 
PAQQNQYVHISSSPQNTGRTASPPAIPVHLHPHQ 

TMIPHTLTLGPPSQVVMQYADSGSHFVPREATK 

KAESSRLQQAIQAKEVLNGEMEKSRRYGAPSSA 

DLGLGKAGGKSVPHPYESRHVVVHPSPSDYSSR 

DPSGVRASVMVLPNSNTPAADLEVQQATHREAS 

PSTLNDKSGLHLGKPGHRSYALSPHTVIQTTHSA 

SEPLPVGLPATAFYAGTQPPVIGYLSGQQQAITY 

AGSLPQHLVIPGTQPLLIPVGSTDMEASGAAPAIV 

TSSPQFAAVPHTFVTTALPKSENFNPEALVTQAA 

YPAMVQAQIHLPWQSVASPAAAPPTLPPYFMK

GSnQLANGELKKVEDLKTEDFIQSAEISNDLKIDS 

STVERJDEDSHSPGVAVIQFAVGEHRAQVSVEVLV 

bYPFFVFGQGWSSCCPERTSQLFDLPCSKLSVGD 

VCISLTLKNLKNGSVKKGQPVDPASVLLKHSKA 

DkjL A u £>KriK Y AbQEN G1NQG S A QMLSENGELKF 

PEKMGLSAAPFLTKIEPSKPAATRKRRWSAPESR 

KLEKSEDEPPLTLPKPSLIPQEVKICIEGRSNVGK 


3614 


A 


3 


114 


FFESRLRCKCCEPRGSWARFGCWRLQPEFKPKQ 
LEG 


3615 


A 


3


1603 


DAWALTNQFSDSKQHIEVLKESLTAKEQRAAILQ 

TEVDALRLRLEEKETMLNKKTKQIQDMAEEKGT 

QAGEIHDLKDMLDVKERKVNVLQKKIENLQEQL 

RDKEKQMSSLKERVKSLQADTTNTDTALTTLEE 

ALAEKERTIERLKEQRDRDEREKQEEIDN^'XKDL 

KDLKEKVSLLQGDLSEKEASLLDLKEHASSLASS 

GLKKDSRLKTLEIALEQKKEECLKM5SQLKKAH 

EAALEARASPEMSDRIQHLEREITRYKDESSKAQ 

AEVDRLLEILKEVENEKNDKDKKIAELESLTSRQ 

VKDQNKK V ANLKHKEQ VEBCJCKS AQMLEEARRR 

EDNLNDSSQQLQDSLRKKDDRIEELEEALRESVQ

1TAEREMVLAQEESARTNAEKQVEELLMAMEKV 

KQELESMKAKLSSTQQSLAEKETHLTOLRAERR 

KHLEEVLEMKQEALLAAISEKDANIALLELSSSK 

KKTQEEVAALKREKDRLVQQLKQQTQNRMKLM

ADNYEDDHFKSSHSNQTNHKPSPDQDEEEGIWA 


3616 


A 


244 


1420 


RRRWRARGGLVPTLAWAEATGAYVPGRDKPDL 

PTWKRNFRSALNRKEGLRLAEDRSKDPHDPHKI 

YEFVNSGVGDFSQPDTSPDTNGGGSTSDTQEDEL 

DELLGNMVLAPLPDPGPPSLAVAPEPCPQPLRSPS 

LDNPTPFPNLGPSENPLKRLLVPGEEWEFEVTAF 

YRGRQVFQQTISCPEGLRLVGSEVGDRTLPGWP 

VTLPDPGMSLTDRGVMSYVRHVLSCLGGGLAL 

WRAGQWLWAQRLGHCHTYWAVSEELLPNSGH 

GPDGEVPKDKEGGVFDLGPFIVGSLGPPDLITFTE 

GSGRSPRYALWFCVGESWPQDQPWTKRLVMVK 

VVPTCLRALVEMARVGGASSLENTVDLfflSNSHP 

LSLTSDQYKAYLQDLVEGMDFQGPGES 


3617 


A 


852 


304 


RGGLLSKMARVLKAAAANAVGLFSRLQAPIPTV 

KAbo 1 b^rJLlJt^ V 1 ubv WNLGKLNHVAIAVPDLE 

KAAAFYKNILGAQVSEAVPLPEHGVSVVFVNLG 

NTKMELLHPLGRDSPIAGFLQKNKAGGM 

VDNINAAVMDLKKI<XIRSLSEEVKIGAHGKPVIF 

LHPKDCGGVLVELEQA 


3618 


A 


3 


5992 


DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 
DDDMEGDEAWRCTLSANMYVDEILVWCASEL 
NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 
GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 
SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 
VLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSE 
AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 
ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 
VLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRK 
GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 
GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 
GKTNVALMCMLREIGKHINMDGTINVDDFK1IY1 
APMRSLVQEMVGSFGICRLATYGITVAELTGDHQ 
LCKEEISATQIIVCTPEKWDIITRKGGERTYTQLV 
RLIILDEIHLLHDDRGPVLEALVARAIRNIEMTQE 
DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 
SFRPVPLEQTYVGITEKKAIKRFQIMNEIVYEKIM 
EHAGKNQVLVFVHSRKETGKTARAIRDMCLEKD 
TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 
GFAIHHAGMTRVDRTLVEDLFGDICHIQVLVSTA 
TLAWGVNLPAHTVIIKGTQVYSPEKGRWTELGA 
LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 
SLLNQQLPIESQMVSKLPDMLNAEIVLGNVQNA 
KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 
PLLDQRRLDLVHTAALMLDKNNL VKYDKKf GN 
FQVTELGRIASHY^qTNDTVQTYNQLLKPTLSEIE 
LFRVFSLSSEFKMTVREEEKLELQKLLERVPIPVK 
ESIEEPSAKINVLLQAFISQLKLEGFALMADMVY 
VTQSAGRLMRAIFEIVLNRGWAQLTDKTLNLCK 
ME)KRMWQSMCPLRQFRKLPEEVVKKIEKKNFP 
FERLYDLNHNEIGELimPKMGKTIHKYVHLFPK 
LELSVHLQPITRSTLKVELTITPDFQWDEICVHGSS
EAFWBLVEDVDSEVILHHEYFLLKAKYAQDEHLI 
TFFVPVFEPLPPQYFIRVVSDRWLSCETQLPVSFR 
HLILPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 
DKFPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 
TICAEFAILRMLLQNSEGRCVYITPMRLWQEQVY 
MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 
MnSTPEKWDJLSRRWKQRKNVQNINLFVVDEV 
HLIGGENGPVLEVICSRMRYISSQIERPIRIVALSSS
LSNAKDVAHWLGCSATSTFNFHPNVRPVPLELHI 
QGFNISHTQTRLLSMAKPVFHAITKHSPKKPYTVF 
VPSRKQTRLTAIDILTTCAADIQRQRFLHCTEKDL 
IPYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 
VEQLFSSGAIQVVVASRSLCWGMNVAAHLVIIM 
DTLYYNGIOHAYVDYPIYDVLQMVGHANRPLQ 
DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD
HCMHDHFNAEIVTKTIENKQDAVDYLTWTFLYR 
RMTQNPNYYNLQG1SHRHLSDHLSELVEQTLSDL 
EQSKCISIEDEMDVAPLM.GMIAAYYYINYTTIEL 
FSMSLNAKTKVRGLIEnSNAAEYENlPIRHHEDN 
LLRQLAQKWHKLNNPKFNDPHVKTNLLLQAHL 

WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 
PSGLFKRCTDKGVESVFD1MEMEDEERNALLQLT 
DSQIADVARFCNRYPNIELSYEWDKDSIRSGGP 
VVVLVQLEREEEVTGPVIAPLFPQKREEGWWVV 
IGDAKSNSLISIKRLTLQQKAKVKLDFVAPATGG 
RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 
DSD 


3619 


A 


3 


5992 


DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 
DDDMEGDEAVVRCTLSANMYVDEILVWCASEL
NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 
GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 
SRFYDDATVSQKKADEVLEILKTASDDRECENQL 
VLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSE 
AEKEREvlGKMEADPELSKFLYQLHETEKEDLIRE 
ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 
VLDLEDLVFTQGSHF3V1ANKRCQLPDGSFRRQRK 
GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 
GFEGFKTLNR1QSKLYRAALETDENLLLCAPTGA 
GKTNVALMCMLREIGKHINMDGT1NVDDFKIIYI 
APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 
LCKEEISATQUVCTPEKWDIITRKGGERTYTQLV 
RLIILDEIHLLHDDRGPVLEALVARAIRNIEMTQE 
DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 
SFRPWLEQTYVGITEKKAIKRFQIMNEIVYEKIM 
EHAGKNQVLVFVHSRKETGKTARAIRDMCLEKD 
TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 
GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 
TLAWGVNLPAHTVIIKGTQVYSPEKGRWTELGA 
LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 
SLLNQQLPffiSQMVSKLPDMLNAEIVLGNVQNA 
KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 
PLLDQRRLDLVHTAALMLDKNNLVKYDKKTGN 
FQVTELGRMSHYYITODTVQTYNQLLKPTLSEIE 
LFRVFSLSSEFKNITVREEEKLELQKLLERVPIPVK 
ESIEEPSAKINVLLQAFISQLKLEGFALMADMVY 
VTQSAGRLA1RAIFEIVLNRGWAQLTDKTLNLCK 
MTOKRMWQSMCPLRQFRKLPEEVVKKIEIO^ 
FERLYDLNHNEIGELIRMPKMGKTIHKYVHLFPK 
LELSVHLQPITRSTLKVELTITPDFQWDEKVHGSS 
EAFWILVEDVDSEVILHHEYFLLKAKYAQDEHLI 
TFFVPVFEPLPPQYFIRVVSDRWLSCETQLPVSFR 
HLILPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 
DKFPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 
HCAEFAILRMLLQNSEGRCVYITPMRLWQEQVY 
MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 
MHSTPEKWDILSRRWKQRKNVQNINLFVVDEV 
HLIGGENGPVLEVICSRMRYISSQIERPIRIVALSSS 
LSNAKDVAHWLGCSATSTFNFHPNVRPVPLELffl 
QGFMSHTQTRLLSMAKPVFHAITKHSPKICPVIVF 
WSRKQTRLTAmiLTTCAADIQRQRFLHCTEKDL 
IPYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 
VEQLFSSGAIQWVASRSLCWGMNVAAHLVIIM 
DTLYYNGKIHAYVDYPIYDVLQMVGHANRPLQ
DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD 

RMTQNPNYYNLQGISHRHLSDHLSELVEQTLSDL 
EQSKCISmDEMDVAPLlvnLGMlAAYYYINYTTIEL 
FSMSLNAKTKVRGLffillSNAAEYENPIRHHEDN 
LLRQLAQKVPHKLNNPKFNDPH\OCTNLLLQAPIL 
SRMQLSAELQSDTEEILSKAIRLIQACVDVLSSNG 
WLSPALAAMELAQMVTQAMWSEDSYLRKLPPF 
PSGLFKRCTOKGVESVFDIMEMEDEERNALLQLT 
DSQ1ADVARFCNRYPNIELSYEVVDKDSIRSGGP
VVVLVQLEREEEVTGPyiAPLFPQKREEGWWVV 
IGDAKSNSLISIKRLTLQQKAKVKLDFVAPATGG 
RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 
DSD 


3620 


A


1205 


323


VKMALAARLLPQFLHSRSLPCGAVRLRTPAVAE 
VRLPSATLCYFCRCRLGLGAALFPRSARALAASA 
LPAQGSRWPVLSSPGLPAAFASFPACPQRSYSTE 
EK^QQHQKTKJvIIVLGFSNPINWVRTRlKAFLIWA
YFDKEFSITEFSEGAKQAFAHVSKLLSQCKFDLL 
EELVAKEVLHALKEKVTSLPDNHKNALAANIDEI 
WTSTGDISIYYDEKGRKFVNILMCFWYLTSANIP 
SETLRGASVFQVKLGNQNVETKQLLSASYEFQR 
EFTQGVKPDWTIARIEHSKLLE 


3621 


A 


2 


2995 


SSSRSRHSSISPVRLPLNSSLGAELSRKKKERAAA 

AAAAKMDGKESSYERSGSYSGRSPSPYGRRRSSS 

PFLSKRSLSRSPLPSRKSMKSRSRSPAYSRHSSSH 

SKKKRSSSRSRHSSISPVRLPLNSSLGAELSRKKK 

ERAAAAAAAKMDGKESSYERSGSYSGRSPSPYG 

RRRSSSPFLSKRSLSRSPLPSRKSMKSRSRSPAYS 

RHSSSHSKKKRSSSRSRHSSISPVRLPLNSSLGAEL 

SRKKKERAAAAAAAKMDGKESKGSPVFLPRKE

NSSVEAKDSGLESKKLPRSVKLEKSAPDTELVNV

THLNTEVKNSSDTGKVKLDENSEKHLVKDLKAQ 

GTRDSKPIALKEEIVTPKETETSEKETPPPLPTIASP 

PPPLPTTTPPPQTPPLPPLPPIPALPQQPPLPPSQPA 

FSQVPASSTSTLPPSTHSKTSAVSSQANSQPPVQV 

SVKTQVSVTAAIPHLKTSTLPPLPLPPLLPGDDDM 

DSPKETLPSKPVKKEKEQRTRHLLTDLPLPPELPG 

GDLSPPDSPEPKAITPPQQPYKKRPKICCPRYGER 

RQTESDWGKRCVDKFDIIGIIGEGTYGQVYKAKD 

KDTGELVALICKVRLDNEKEGFPITAIREIKILRQL 

IHRSVVNMKEIVTDKQDALDFKKDKGAFYLVFE 

YMDHDLMGLLESGLVHFSEDHIKSFMKQLMEGL 

EYCHKKNFLHRDIKCSNILLNNSGQIKLADFGLA 

RLYNSEESRPYThKVlTLWYRPPKLLLGEERYTP 

AIDVWSCGCELGELFTKKPIFQANLELAQLELISR 

LCGSPCPAVWPDVIKLPYFNTMKPKKQYRRRLR 

EEFSFIPSAALDLLDHMLTLDPSKRCTAEQtLQSD 

FLKDVELSKMAPPDLPHWQDCHELWSKKRRRQ 

RQSGVVVEEPPPSKTSRKETTSGTSTEPVKNSSPA 

PPQPAPGKVESGAGDAIGLADITQQLNQSELAVL 

LNLLQSQTDLSIPQMAQLLNIHSNPEMQQQLEAL 

NQSISALTEATSQQQDSETMAPEESLKEAPSAPVI 

LPSAEQTTLEASSTPADMQNILAVLLSQLMKTQE 

PAGSLEENNSDKNSGPQGPRRTPTMPQEEAAGRS 

NGGNAL 


3622 


A 


1 £ 

ID 


jy\) 


lrJDlvuoA i rn 1 AA VKKJrAvjiiv^rl 1 MoJL>lJiAJsX,o 1 

EHLGDKIKDEDKLRVIGQDSSEIHFKVK2vrm>LK 
KLKKSYCQRQGVPVNSLRFLFEGQRIADNHTPEE 
LGMEEEDVIEVYQEQIGGHSTV 


3623 


A 


2 


1544 


PPPAPGPDGLNEGCLHRLSMPHQRPRTCAMNPE 

LTMESLGTLHGARGGGSGGGGGGGGGGGGGGP 

GHEQELLASPSPHHARRGPRGSLRGPPPPPTAHQ 

ELGTAAAAAAAASRSAMVTSMASILDGGDYRPE 

LSIPLHHAMSMSCDSSPPGMGMSNTYTTLTPLQP 

LPPISTVSDKFHHPHPHHHPHHHHHHHHQRLSGN 

VSGSFTLMRDERGLPAMNNLYSPYKEMPGMSQS 

LSPLAATPLGNGLGGLHNAQQSLPNYGPPGHDK 

MLSPNFDAHHTAMLTRGEQHLSRGLGTPPAAM 

MSHLNGLHHPGHTQSHGPVLAPSRERPPSSSSGS 

QVATSGQLEEINTKEVAQRJTAELKRYSIPQAIFA 

QRVLCRSQGTLSDLLRNPKPWSKLKSGRETFRR 

MWKWLQEPEFQRMSALRLAACKRKEQEPNKDR

NNSQKKSRLWTDLQRRTLFAIFKENKRPSKEMQ 

ITISQQLGLELTTVSNFFMNARRRSLEKWQDDLS 

TGGSSSTSSTCTKA 


3624 


A 


27 


2152 


SARKAEAATSGTAARDGSVGRNLVPPPSASAPK 

AEVESNEKDNRPEEEEQVIHEDDERPSEKNEFSR 

RKRSKSEDMDNVQSKRRRYMEEEYEAEFQVKIT 

AKGDINQKLQKVIQWLLEEKLCALQCAVFDKTL 

AELKTRVEKIECNK^HKWLTELQAKIAJRLTKRF 

EAAKEDLKKRHEHPPNPPVSPGKTVNDVNSNNN 

MSYRNAGTVRQMLESICRNVSESAPPSFQTPVNT 

VSSTNLVTPPAVVSSQPKLQTPVTSGSLTATSVLP 

APNTATVVATTQVPSGNPQPTISLQPLPVILHVPV 

AVSSQPQLLQSHPGTLVTNQPSGNVEFISVQSPPT 

VSGLTKNPVSLPSLPNPTKPNNVPSVPSPSIQRNP 

TASAAPLGTTLAVQAVPTAHSIVQATRTSLPTVG 

PSGLYSPSTNRGPIQMKIPISAFSTSSAAEQNSNTT 

PRIENQTNKTIDASVSKKAADSTSQCGKATGSDS

SGVIDLTMDDEESGASQDPKKLNHTPVSTMSSSQ 

PVSRPLQPIQPAPPLQPSGVPTSGPSQTTIHLLPTA

PTTVNVTHRPVTQVTTRLPWRAPANHQVVYTT 

LPAPPAQAPLRGTVMQAPAVRQVNPQNSVTVRV 

PQTTTWVNNGLTLGSTGPQLTVHHRPPQVHTEP 

PRPVHPAPLPEAPQPQRLPPEAGSTSRPSEATLEV 

SHAFRVKMAIVLVMECPGGGSKLCHC 


3625 


A 


210 


1115 


ASPFLRPQGHDSGEREPFSQTPGLMQPFSIPVQIT 

LQGSRRRQGRTAFPASGKKRETDYSDGDPLDVH 

KRLPSSTGEDRAVMLGFAMMGFSVLMFFLLGTT 

ILKPFMLSIQREESTCTAIHTDIMDDWLDCAFTCG

VHCHGQGKYPCLQVFVNLSHPGQKALLHYNEE 

AVQINPKCFYTPKCHQDRNDLLNSALDEKEFFDH 

KNGTPFSCFYSPASQSEDVILIKKYDQMAIFHCLF 

WPSLTLLGGALIVGMVRLTQHLSLLCEKYSTVV 

RDEVGGKVPYIEQHQFKLCIMRRSKGRAEKS 


3626 


A 


9 


921 


SSVVEFSALSVSMACLSPSQLQKFQQDGFLVLEG 
FLSAEECVAMQQRIGEIVAEMDVPLHCRTEFSTQ 
EEEQLRAQGSTDYFLSSGDKIRFFFEKGVFDEKG 
NFLVPPEKSINKIGHALHAHDPVFKSITHSFKVQT 
LARSLGLQMPVVVQSMYIFKQPHFGGEVSPHQD 

SHTSGVSRRMVRAPVGSAPGTSFLGSEPARDNSL 

FVPTPVQRGALVLIHGEVVHKSKQNLSDRSRQA 

YTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


3627 


A 


231 


644 


INSSPRTGRDHQEL>JLHTERDSRSQRAVLKIPRQ 
OTGIFYWIFLPSRSHSASHGSRQRQVSCQGTQDEI 
LICMRNTFAELKNSLEALSSRMDQAEERJGTQAG 
VQWRDHGSLQPQPPEFKQCFHLSLPSSWDYRAC 
LS 


3628 


A 


2 


810 


GCKHLLQNSWYDPRVREADRVGQRARRPRAAM 

DWLMGKSKAKPNGKKPAAEERKAYLEPEHTKA 

R1TDFQFKELVVLPREIDLNEWLASNTTTFFHHTN 

LQYSTISEFCTGETCQTMAVCNTQYYWYDERGK 

KVKCTAPQYVDFVMSSVQKLVTDEDVFPTKYG 

REFPSSFESLVRKICRHLFHVLAHIYWAHFKETLA 

LELHGHLNTLYVHFILFAREFNLLDPKETAIMDD 

LTEVLCSGGRRGSTVGAVGMGPAAGAPGAQNH 

VKER 


3629 


A 


699 


1604 


CSHGSSAVSAWSPLFQASEVERQLSMQVHALRE 

DFREKNSSTNQHIIRLESLQAEIKMLSDRXRELEH 

RLSATLEENDLLQGTVEELQDRVLILERQGHDKD 

LQLHQSQLELQEVRLSCRQLQVKVEELTEERSLQ

SSAATSTSLLSEIEQSMEAEELEQEREQLTLLSVE 

MTALKEERDRLRVTSEDKEPKEQLQKAIRDRDE 

AIAKKNAVELELAKCRMDMMSLNSQLLDAIQQ 

KLNLSQQLEAWQDDMHRVIDRQLMDTHLKERS 

QPAAALCRGHSAGRGDEPSIAEGKRLFSFFRKI 


3630 


A 


423 


1 


PAKVLTLDIYLSKTEGAQVDEPVVITPRAEDCGD 

WDDMEKRSSGRRSGRRRGSQKSTOSPGADAELP 

ESAARDDAVFDDEVAPNAASDNASAEKKVKSPR 

AALDGGVASAASPESKPSPGTKGQLRGESDRSK 

QPPPASSP 


3631 


A 


2082 


674 


WSGFWQLPGVRGVGSAPGGDGAEFTSRRGSSRR 

PGAACPGCRGAGSERAPGGMGRRRAPELYRAPF 

PLYALQVDPSTGLLIAAGGGGAAKTGIKNGVHF 

LQLELINGRLSASLLHSHDTETRATMNLALAGDI 

LAAGQDAHCQLLRFQAHQQQGNKAEKAGSKEQ 

GPRQRKGAAPAEKKCGAETQHEGLELRVENLQA 

VQTDFSSDPLQKVVCFNHDNTLLATGGTDGYVR 

VWKVPSLEKVLEFKAHEGEEEDLALGPDGKLVT 

VGRDLKASVWQKDQLVTQLHWQENGPTFSSTP 

YRYQACRFGQVPDQPAGLRLFTVQIPHKRLRQPP 

PCYLTAWDGSNFLPLRTKSCGHEVVSCLDVSES 

GTFLGLGTVTGSVAIYIAFSLQCLYYVREAHGIV

VTDVAFLPEKGRGPELLGSHETALFSVAVDSRCQ 

LHLLPSRRSVPVWLLLLLCVGLIIVTILLLQSAFPG 

FL 


3632 


A 


942 


40 


PWCQRVEVRSCGSSKRSCSRWSGSSWDGSRSLG 

RGLNHTSLNRSPPFTPDTMTHCCSPCCQPTCCRT 

TCCRTTCWKPTTVTTCSSTPCCQPSCCVPSCCQP 

CCHPTCCQNTCCRTTCCQPTCVASCCQPSCCSTP 

CCQPTCCGSSCCGQTSCGSSCCQPICGSSCCQPCC 

HPTCYQT1CFRTTCCQPTCCQPTCCRNTSCQPTCC 

GSSCCQPCCHPTCCQTICRSTCCQPSCVTRCCStP 

CCQPTCGGSSCCSQTCNESSYCLPCCRPTCCQTT 


3633 


A 


605 


3004 


GPEGYRGRRARHPSLGSTTGHCGGGRGAEGTGT 
DPAAPAARLNVDGLLVYFPYDYIYPEQFSYMRE 
LKRTLDAKGHGVLEMPSGTGKTVSLLALIMAYQ 
RAYPLEVTKLIYCSRTVPEIEKVIEELRKLLNFYE 

KQEGEKLPFLGLALSSRKNLCIHPEVTPLRFGKD 

VDGKCHSLTASYVRAQYQHDTSLPHCRFYEEFD 

AHGREVPLPAGIYNLDDLKALGRRQGWCPYFLA 

RYSILHANVVVYSYrT^LDPKIADLVSkELARK 

AVWTOEAHNIDNVCroSMSVNLTRRTLDRCQG 

NLETLQKTVLRIKETDEQRLRDEYRjRLVEGLREA 

SAARETDAHLANPVLPDEVLQEAVPGSIRTAEHF 

LGFLRRLLEYVKWRLRVQHWQESPPAFLSGLA 

QRVCIQRKPLRFCAERLRSLLHTLEITDLADFSPL 

TLLANFATLVSTYAKGFTniEPFDDRTPTIANPIL 

HFSCMDASLAIKPVFERFQSVnTSGTLSPLDIYPK 

ILDFHPVTMATFTMTLARVCLCPMIIGRGNDQVA 

ISSKFETREDIAVIRNYGNLLLEMSAVVPDGIVAF 

FTSYQYMESWASWYEQGILENIQRNKLLFIETQ 

DGAETSVALEKYQEACENGRGAILLSVARGKVS 

bulDr Vrlri i UKA VIMrG VrYV YTQSRILKARLEY 

LRDQFQIRENDFLTFDAMRHAAQCVGRAIRGKT 

DYGLMVFADKRFARGDKRGKLPRWIQEHLTDA 

INJLiNbl VJJiiOVV^VAlvirJLKyMAV^rrrtKJb 

SLLSLEQLESEETLKRIEQIAQQL 


3634 


A 


159 


384 


LKMSSKTASTNNIAQARRTVQQLRLEASIERIKV 
SKASADLMSYCEEHARSDPLLIGIPTSENPFKDKK 


3635 


A


5 


409 


TELSQLEKAHPPADMGRRKSKRKPPPKKKMTGT 
LETQFTCPFCKHEKSCDVKMDRARNTGVISCTV 
LLElirylrl 1 Cii^OWjLUr r^K VOKCjLboOrCobGP 
LCALVQGQSRPEEQVPPSDFCGVRRCRAGFQCQ 


3636 


A 


48 


282


DHLKSCYQDSHEDPTKMKRFLFLLLTISLLVMVQ 
IQTGLSGQNDTSQTSSPSASSSMSGGIFLFFVANAI 
IHLFCFS 


3637 


A 


1 


1248 


ARAGSVVGSAAARGPPAGCRCERAARLPSSPAR 

RRRCDWVEDGAGRMEILMTVSKFASICTMGAN 

ASALEKEIGPEQFPVNEHYFGLVNFGNTCYCNSV 

LQALYFCRPFREKGLAYKSQPRKKESLLTCLADL

FHSIATQKKKVGVIPPKKFITRLRKENELFDNYM

QQDAHEFLNYLLNTIADELQEERKQEKQNGRLPN 

GNIDNENNNSTPDPTWVHEEFQGTLTNETRCLTC 

ETISSKDEDFLDLSVDVEQNTSITHCLRGFSNTET 

bobb IKY x CbbCKbivQ bAHKJuvlK VKKLPMILAL 

HLKRFKYMDQLHRYTKLSYRVVFPLELRLFNTS 

GDATNPDRMYDLVAVVVHCGSGPNRGHYIAIV 

KSHDFWLLFDDDIVEKIDAQAIEEFYGLTSDISKN 

SESGYILFYQSRD 


3638


A 


11 


630 


PAGIPVSTISSDRRASTDLTTIKMKPDETPMFDPNL 

LKEVDWSQNTATFSPAISPTHPGEGLVLRPLCTA 

UbJNKUr r K V LCjvJL I b i Cj V V br bQr MK.br bHMKK 

SGDYYVTWEDVTLGQIVATATLUEHKFIHSCAK 

RGRVEDVWSDECRGKQLGNLLLSTLTLLSICKL 

NCYKITLECLPQNVGFYKICFGYTVSEENYMCRR 

FLK 


3639 


A 


2 


1200 


PRVRLLRPSRSRSCRGT T STR APfiP9PFR<U W^PT 

LPHAMKSPFYRCQNTTSVEKGNSAVMGGVLFST 

GLLGNLLALGLLARSGLGWCSRRPLRPLPSVFY 

MLVCGLTVTDLLGKCLLSPWLAAYAQNRSLRV 

LAPALDNSLCQAFAFFMSFFGLSSTLQLLAMALE 

CWLSLGHPFFYRRHITLRLGALVAPWSAFSLAF 
CALPFMGFGKFVQYCPGTWCFIQMVHEEGSLSV 

T fTVQVT VCCT "N/f A T T \71 A T\7T f^WX /Tl A X AT) XTT VAX/f • 
L-O I o VJL-YooL-MAJLLVL-A 1 VJA^INJLAJAJVLKNJLYAM 

HRRLQRHPRSCTRDCAEPRADGREASPQPLEELD
HLLLLALMTVLrTMCSLPVIYRAYYGAFKDVKE 
KNRTSEEAEDLRALRFLSVISIVDPWIFIIFRSPVFR 
IFFHKIFIRPLRYRSRCSNSTNMESSL 


3640 


A 


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEY 
AEEAIKLGSTAIGIQTSEGVCLAVEKRITSPLMEPS 
SIEKIVEIDAH1GCAMSGLIADAKTLIDKARVETQ 
INtt Wr L i JNtlMl VJDbV I^AVoINJLAJLV^rObrilJADr 

GAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFV 
QCDARAIGSASEGAQSSLQEVYHKSMTLKEAIKS 
SLIILKQVMEEKLNATlSnGELAIVQPGQNFHMFTK 
EELEEVIKDI 


3641 


A 


2 


1254 


PTGQGGRRAEARSCLLSKAMLGRSGYRALPLGD 

FDRFQQSSFGFLGSQKGCLSPERGGVGTGADVPQ 

SWPSCLCHGLISFLGFLLLLVTFPISGWFALKJVPT 

YERMIVFRLGRIRTPQGPGMVLLLPFIDSFQRVDL 

RTRAFNVPPCKLASKDGAVLSVGADVQFRIWDP 

VLSVMTVKDLNTATRMTAQNAMTKALLICRPLR 

EIQMEKLiaSDQLLLEINDVTRAWGLEVDRVELA 

VEAVLQPPQDSPAGPNLDSTLQQLALHFLGGSM 

XTCAA A A PCD/TO A T\T\ TT2\ JT\ T OX2\ 7T7TVD A T>01 ir^ ADO 

IN&MAuUAJ'or'UJrA.UI VJbMViilVJbrrAry VLrARi> 
SPKQPLAEGLLTALQPFLSEALVSQVGACYQFNV 
VLPSGTQSAYFLDLTTGRGRVGHGVPDGIPDVV 

VFMAPATYT P AT T PPT7T PPT OA VTVyfCnPT V\TV nr\ 

LAMAMKLEAVLRALK 


3642 


A 


1


237


RRGEIDMATEGDVELELETETSGPERPPEKPRKH

DSGAADLERVTDYAEEKEIQSSNLETAMSVIGDR 

RSREQKAKQER 


3643 


A 

A 


Q4 




PTfEP PPPPPPK/TE A T T^rVAT tct o\/vt:tt 

TLSDLECDYINARSCCSKLNKWVIPELIGHTIVTV 
LLLMSLHWFIFLLNLPVATWNTYRYIMVPSGNM 
GVFDPTEIHNRGQLKSHMXEAMIKLGFHLLCFF 
MYLYSMILALIND 


3644 


A 


95 


2808


TSCRHFPITSEDPLNYLLELTVERIYAYQALPLGFL 

FCSRDPVPEYLNHCGVKYVLISDRASFCALmFFS 

PFRNVFRPAAGGGIAPPPRLWFQPSLSDAEMEIPK 

LLPARGTLQGGGGGGIPAGGGRVHRGPDSPAGQ 

VPTRRLLLPRGPQDGGPGRRREEASTASRGPGPS 

LFAPRPHQPSGGGGGGGDDFFLVLLDPVGGDVE 

TAGSGQAAGPVLREEAEEGPGLQGGESGANPAG 

PTALGPRCLSAVPTPAPISAPGPAAAFAGTVTIHN 

QDLLLRFENGVLTLATPPPHAWEPGAAPAQQPG 

CLIAPQAGFPHAAHPGDCPELPPDLLLAEPAEPAP 

APAPEEEAEGPAAALGPRGPLGSGPGWLYLCPE 

ALCGQTFAKKHQLKMHLLTHSSSQGQRPFKCPL 

GGCGWTFTTSYKLKRHLQSHDKLRPFGCPAEGC 

GKSFTTVYTSILKAHMKGHEQENSFKCEVCEESFP 

TOAKLGAHORSHFEPERPYOCAFSGCKKTFITVS 

ALFSHNRAHFREQELFSCSFPGCSKQYDKACRLK 

IHLRSHTGERPFLCDFDGCGWNFTSMSKLLRHKR 

KHDDDRRFMCPVEGCGKSFTRAEHLKGHSITHL 

STKPFVCPVAGCCARFSARSSLYIHSKKHLQDVD 

TWKSRCPISSCNrO.FTSKHSMKTHMVKRHKVGQ 

DLLAQLEAANSLTPSSELTSQRQNDLSDAEIVSLF 

SDVPDSTSAALLDTALVNSGILTIDVASVSSTLAG 

HLPANNNNSVGQAVDPPSLMATSDPPQSLDTSLF 

FGTAATGFQQSSLNMDEVSSVSVGPLGSLDSLA 

MKNSSPEPQALTPSSKLTVDTDTLTPSSTLCENSV 

SELLTPAKAEWSVHPNSDFFGQEGETQFGFPNAA 

GNHGSQKERNLITVTGSSFLV 


3645 


A 


2194 


1707 


TVSFHKTMASLKCSTWCVICLEKPKYRCPACRV 

PYCSVVCFRKHKEQCOTETRPVEKKIRSALPTKT 

VKPVENKDDDDSIADFLNSDEEEDRVSLQNLKN 

LGESATLRSLLLNPHLRQLMVNLDQGEDKAKLM 

RAYMQEPLFVEFADCCLGIVEPSQNEES 


3646 


A


85


1948 


ERGGGKAAAAAAAAAAARALAASGQDPRPHPR 

APPWDDSGDDDEATTPADKSELrlHTLKNLSLKL 

DDLSTCNDLIAKHGAALQRSLTELDGLKJPSESG 

EKLKVVNERATLFRITSNAMINACRDFLELAEIHS 

RKWQRALQYEQEQRVHLEETIEQLAKQHNSLER 

AFHSAPGRPANPSKSFIEGSLLTPKGEDSEEDEDT 

EYFDAMEDSTSFITVITEAKEDSRKAEGSTGTSSA 

DWSSADNVLDGASLVPKGSSKVKRRVRIPNKPN 

YSLNLWSIMKNCIGRELSRJPMPVNFNEPLSMLQ 

RLTEDLEYflHLLDKAVHCTSSVEQMCLVAAFSV 

SSYSTTVHR1AKPFNPMLGETFELDRLDDMGLRS 

LCEQVSHHPPSAAHYVFSKHGWSLWQEITISSBCF 

RGKYISIMPLGAIHLEFQASGNHYVWRKSTSTVH 

NI1VGKLWIDQSGDIEIVNHKTNDRCQLKFLPYSY 

FSKEAARKVTGWSDSQGKAHYVLSGSWDEQM 

ECSKVMHSSPSSPSSDGKQKTVYQTLSAKLLWK 

KYPLPENAENMYYFSELALTLNEHEEGVAPTDS 

RLRPDQRLMEKGRWDEANTEKQRLEEKQRLSR 

RRRLEACGPGSSCSSEE 


3647 


A


46 


5007 


PTGDACVSTSCELASALSHLDASHLTENLPKAAS
ELGQQPMTELDSSSDLISSPGKKGAAHPDPSKTS 
VDTGQVSRPENPSQPASPRVTKCKARSPVRLPHE 
GSPSPGEKAAAPPDYSKTRSASETSTPHNTRRVA 
ALRGAGPGAEGMTPAGAVLPGDPLTSQEQRQGA 
PGNHSKALEMTGIHAPESSQEPSLLEGADSVSSR 
APQASLSMLPSTDNTKEACGHVSGHCCPGGSRE 
SPVTDIDSFIKELDASAARSPSSQTGDSGSQEGSA 
QGHPPAGAGGGSSCRAEPVPGGQTSSPRRAWAA 
GAPAYPQWASQPSVLDSINPDKHFTVNKNFLSN 
YSRNFSSFHEDSTSLSGLGDSTEPSLSSMYGDAE 
DSSSDPESLTEAPRASARDGWSPPRSRVSLHKED 
PSESEEEQIEICSTRGCPNPPSSPAHLPTQAAICPAS
AKVLSLKYSTPRESVASPREKVACLPGSYTSGPD 
SSQPSSLLEMSSQEHETHADISTSQNHRPSCAEET 
TEVTSASSAMENSPLSKVARHFHSPPIILSSPNMV 
NGLEHDLLDDETLNQYETSINAAASLSSFSVDVP 
KNGESVLENLHISESQDLDDLLQKPKMIARRPIM 

A lirCVT^TXTT^UXTPkriT'tJT 15 CfTCT/TA'DT XjTO A *D pnriC 

A WrtSJilJNivnLNv<Hj 1 rli^KbKliii^v^rLlVLrAKbrDo 

KIQMVSSSQKKGVTWHSPPQPKTNLENKDLSKK 

SPAEMLLTNGQKAKCGPKLKRLSLKGKAKVNSE 

APAANAVKAGGTDHRXPLISPQTSHKTLSKAVS 

QRLHVADHEDPDRNTTAAPRSPQCVLESKPPLAT 

SGPLKPSVSDTSIRTFVSPLTSPKPVPEQGMWSRF

HMAVLSEPDRGCPTTPKSPKCRAEGRAPRADSG 

PVSPAASKNGMSVAGNRQSEPRLASHVAADTAQ 

PRPTGEKGGNIMASDRLERTNQLKXVEISAEAVSE 

TVCGNKPAESDRRGGCLAQGNCQEKSEIRLYRQ 

VAESSTSHPSSLPSHASQAEQEMSRSFSMAKLAS 

SSSSLQTAIRKAEYSQGKSSLMSDSRGVPRNSIPG 

GPSGEDHLYFTPRPATRTYSMPAQFSSHFGREGH 

PPHSLGRSRDSQVPVTSSWPEAKASRGGLPSLA 

NGQGIYSVKPLLDTSRNLPATDEGDIISVQETSCL 

VTDKIKVTRRHYCYEQNWPHESTSFFSVKQR1KS 

FENLANADRPVAKSGASPFLSVSSKPPIGRRSSGS 

IVSGSLGHPGDAAARLLRRSLSSCSENQSEAGTL 

LPQMAKSPSIMTLTISRQNPPETSSKGSDSELKKS 

LGPLGIPTPTMTLASPVKRNKSSVRHTQPSPVSRS 

KLQELRALSMPDLDKLCSEDYSAGPSAVLFKTEL 

EITPRRSPGPPAGGVSCPEKGGNRACPGGSGPKT 

SAAETPSSASDTGEAAQDLPFRRSWSVNLDQLLV 

SAGDQQRLQSVLSSVGSKSTILTLIQEAKAQSENE 

EDVCFIVLNRKEGSGLGFSVAGGTDVEPKSITVH 

RVFSQGAASQEGTMNRGDFLLSVNGASLAGLAH 

GNVLKVLHQAQLHKDALVVIKKGMDQPRPSAR 

QEPPTANGKGLLSRKTIPLEPGIGRSVAVHDALC 

VEVLKTSAGLGLSLDGGKSSVTGDGPLVIKRVY 

KGGAAEQAGIIEAGDEILAINGKPLVGLMHFDA 

WNIMKSVPEGPVQLLIRKHRNSS 


3648 


A 


337


1564 


KSRLSVTLMPVQLSEHPEWNESMHSLRISVGGLP 

VLASMTKAADPRFRPRWKVVLTFFVGAAILWLL 

CSHRPAPGRPPTHNAHNWRLGQAPANWYNDTY 

PLSPPQRTPAGIRYRIAVIADLDTESRAQEENTWF 

TYLKKGYLTFSDSGDKVAVEWDKDHGVLESHL 

AEKGRGMELSDLIVFNGKLYSVDDRTGVVYQIE 

GSKAVPWVILSDGDGTVEKGFKAEWLAVKDER

LYVGGLGKEWTTTTGDVVNENPEWVKVVGYK 

GSVDHENWVSNYNALRAAAGIQPPGYLIHESAC 

WSDTLQRWFFLPRRASQERYSEKDDERKGANLL 

LSASPDFGDIAVSrTVGAVVPTHGFSSFKFIPNTDD 

QIIVALKSEEDSGRVASYIMAFTLDGRFLLPETKI 

GSVKYEGIEFI 


3649 


A 


1 


775


PTRPGSGSAGGARVGSGEFGVEMAALAPLPPLPA

QFKSIQHHLRTAQEHDKRDPVVAYYCRLYAMQ 

TGMKBDSKTPECRKFLSKLMDQLEALKKQLGDN 

EAITQEIVGCAHLENYALKMFLYADNEDRAGRF 

HKNMIKSFYTASLLIDVITVFGELTDENVKHRKY 

ARWKATY1HNCLKNGETPQAGPVGDBEDNDIEEN 

EDAGAASLPTQPTQPSSSSTYDPSNMPSGNYTGI 

QIPPGAHAPANTPAEVPHSTGVAK 


3650 


A 


20 


963 


KMAATLGPLGSWQQWRRCLSARDGSRRLLLLL 

LLGSGQGPQQVGAGQTFEYLKREHSLSKPYQGE 

APRPCFLRDWELQVHFKIHGQGKKNLHGDGLAI 

WYTKDRMQPGPVFGNMDKFVGLGWVDTY^ 

EKQQERVFPYISAMVNNGSLSYDHERDGRPTEL 

GGCTAIVRNLHYDTFLVIRYVKRHLTIMMD1DGK 

HEWRDCIEVPGVRLPRGYYFGTSSITGDLSDNHD 

VISLKLFELTVERTPEEEIO.HRDWLPSVDNMKL 

PEMTAPLPPLSGLALFLIVFFSLVFSVFAIVIGHLY 
NKWQEQSRKRFY 


3651 


A 


1 


1218 


RSWAYVKKCKNNMCPNRGLHDGPEPCWLHHA 

AGTVSAVQARGLQPSQSRSRPRVPGLATALAYG 

PAHTPPLSRIGWAMQPPPPGPLGDCLRDWEDLQ 

QDFQNIQVSAAADAGSPPSRVSLAQGQGSGSPGC 

KPSLPAEAEGAAQELENQMKERQGLFFDMEAYL 

PKKNGLYLSLVLGNVNVTLLSKQAKFAYKDEYE 

KFKLYLTIBLILISFTCRFLLNSRVTDAAFNFLLVW 

WCTLTIRESELINNGSRIKGWWVFHHYVSTFLSG

VMLTWPDGLMYQKJRNQFLSFSIvrYQSFVQFLQ 

YYYQSGCLYRLRALGERHTMDLTVEGFQSWMW

RVLTFLLPFLFFGHFWQLFNALTLFNLAQDPQCK 

EWQVLMCGFPFLLLFLGNFFTTLRVVHHKFHSQ 

RHGSKKD 


3652 


A 


640 


164 


VTTSCIIPFAFGLGVRASERLAEIDMPYLLKYQPM
MQTIGQKYCMDPAV1AGVLSRKSPGDKILVNMG 
DRTSMVQDPGSQAPTSWISESQWQTTEVLtTRI 
TELQRRFPTWTPDQYLRGGLCAYSGGAGYVRSS 
QDLSCDFCNDVLARAKYLKRHGF 


3653


A 


2 


909 


IVRRDWQEVSDIHLAMANCKMTKSIRFPALEHC 

YTGGEVVLPKDQEEWKRRTGLLLYENYGQSETG

LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 

ENGUVSMNTADPGSQGITHSLLLQV1DDKGS1LPP 

NTEGNIGIR1BCPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGY1CFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEWK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRKKETGQM 


3654 


A 


2 


909 


IVRRDWQEVSDIHLAMANCICMTKSIRFPALEHC 

YTGGEVVLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKnO>GFMGKATPPYDVQFHMEASV 

ENCIIVSMNTADPGSQGITHSLLLQVIDDKGSILPP 

NTEGN1GIR1KPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGYICFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEVVK 

AFWLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRKKETGQM 


3655 


A 


2


2364 


SPGPSLPESAESLDGSQEDKPRGSCAEPTFTDTG 

MVAHINNSRLKAKGVGQHDNAQNFGNQSFEEL

RAACLRKGELFEDPLFPAEPSSLGFKDLGPNSKN 

VQNISWQRPKDIINNPLFIMDGISPTDICQGILGDC 

WLLAAIGSLTTCPKLLYRVVPRGQSFKKNYAGIF 

HFQIWQFGQWVNVVVDDRLPTKNDKLVFVHST 

ERSEFWSALLEKAYAKLSGSYEALSGGSTMEGL

EiDFTGGVAQSFQLQRPPQNLLRLLRKAVERSSL 

MGCSIEVTSDSELESMTDKMLVRGHAYSVTGLQ 

DVHYRGKMETLIRVRNPWGRIEWNGAWSDSAR 

EWEEVASDIQMQLLHKTEDGEFWMSYQDFLNN 

FTLLEICNLTPDTLSGDYKSYWHTTFYEGSWRTG 

bbACjOCKNKrO Tr WTNPQFKISLPEGDDPEDDAE 

GNVWCTCLVALMQKNWRHARQQGAQLQTIGF

VLYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEIF 

TNSREVSSQLRLPPGEYIIIPSTFEPHRDADFLLRV 

FTEKHSESWELDEVNYAEQLQEEKVSEDDMDQ 

DFLHLFKWAGEGKEIGVYELQRLLNRMAIKFKS 

FKTKGFGLDACRCMINLMDKDGSGKLGLLEFKI 

LWKKLKKWMDIFRECDQDHSGTLNSYEMRLVIE 

KAGIKLNNKVMQVLVARYADDDLIIDFDSFISCF 

LRLKTTVLKI »LTMDPKNTGHICLSLEQVLGEG W 

EG1CRIAPACPSTPPPPSSDVPGPASCPRLFPPWDL 

LPVSTVAADDHVGIEAL 


3656 


A 


3 


174 


PLCTHYLLPELPEKSSRTSPRSRPGNMLSGDPHLP 
QPLCHCLDHCPCCFSGKRLVA 


3657 


A 


1 


444 


DTRSTYHNAHSLPTYVKSPAPCQMTYnCSPAPCQ 

TQTCYVQGASPCQSYYVQAPASGSTSQYCVTDP 

CSAPCSTSYCCLAPRTFGVSPLRRWIQRPQNCNT 

GSSGCCENSGSSGCCGSGGCGCSCGCGSSGCCCL 

GHPMKSRSPALL 


3658 


A 


92 


1537 


SEAPVQPQPYTMTSFYSTSSCPLGCTMAPGARNV 
FVSPIDVGCQPVAEANAASMCLLANVAHANRVR 
VGSTPLGRPSLCLPPTSHTACPLPGTCHIPGNIGIC 
GAYGKNTLNGHEKETMKFLNDRLAOTLEKVRQ 
LEQENAELETTLLERSKCHESTVCPDYQSYFRTIE 
ELQQKILCSKAENARLTVQIDNAKLAADDFRJKL 
ESERSLHQLVEADKCGTQKLLDDATLAKADLEA
QQESLKEEQLSLKSNHEQEVKELRSQLGEKFRJEL 
DIEPTIDLNRVLGEMRAQYEAMVETMHQDVEQ 
WFQAQSEGISLQAMSCSEELQCCQSEILELRCTV 
NALEVERQAQHTLKDCLQNSLCEAEDRYGTELA 
QMQSLISNLEEQLSEIRADLERQNQEYQVLLDVK 
ARLENEIATYRNLTPLQSLFHACLLYFLSKLWPC 
HRWVSLWPWSQHGEMILKARVRRLRLVALGSG 
VPSPCPVFLQD 


3659 


A 


2 


402 


DLLQCLNQLYSASTEMSCQQSQQQCQPPPKCTP 
KCPPKCTPKCPPKCPPKCPPQYSAPCPPPVSSCCG 
SSSGGCCSSEGGGCCLSHHRPRQSLRRRPQSSSC 
CGSGSGQQSGGSSCCHSSGGSGCCHSSGGCC 


3660 


A 


26 


710 


CSAVEVKMAARTAFGAVCRRLWQGLGNFSVNT 

SKGNTAKNGGLLLSTNMKWVQFSNLHVDVPKD 

LTKPVVTISDEPDELYKRLSVLVKGHDKAVLDSY 

EYFAVLAAKELGISIKVHEPPRKIERFTLLQSVHI 

YKKHRVQYEMRTLYRCLELEHLTGSTADVYLEY 

IQRNLPEGVAMEVTKFCFFIFLDT1RTVTRTHQGA 

NLGNT1RRKRRKQVIKPQGGHFCLNLK 


3661 


A 


2 


370 


DVSVAASEPTVYRNPTICMSCQQNQQQCQPPPKC 
PIPKYPPKCPSKCASSCPPPISSCCGSSSGGCCSSG 
GCGCCSSEGGGCCLSHHRHHRSHCHRPKSSNCY 
GSGSGQQSGGSGCCSGGGCC 


3662 


A 


205 


1277 


RICSLPHPNPQKMLKKPLSAVTWLCIFIVAFVSHP 

AWLQKLSKHKTPAQPQLKAANCCEEVKELKAQ 

YANLSSLLSELNKKQERDWVSWMQVMELESN 

SKEIMESRLTDAESKYSEMNNQEDIMQLQAAQTV 

TQTSAGKETSPLRERGVPPHLQHCFYIPPDDFLGS 

PELEWCDMETSGGGWTHQRRICSGLVSFYRDW 

KQ Y KQGFG SIRGDFWLGNEHIHRLSRQPTRLRVE 

MEDWEGNLRYAEYSHFVLGNELNSYRLFLGNY 

TGNVGNDALQYHNNTAFSTKDKDNDNCLDKCA 

QLRKGGYWYNCCTDSNLNGVYYRLGEHNKHLD 

GITWYGWHGSTYSLKRVEMKIRPEDFKP 


3663 


A 


64 


1456 


LSSAKETLAQMYNTVWNMEDLDLEYAKTDINC 

GTDLMFYmMDPPALPPKPPKPTTVANNGMNNN 

MSLQDAEWYWGDISREEVNEKLRDTADGTFLV 

RDASTKMHGDYTLTLRKGGNNKLIKDFHRDGKY 

GFSDPLTFSSVVELINHYRNESLAQYNPKLDVKL 

LYPVSKYQQDQ WKEDNIEAVGKKLHEYNTQFQ 

EKSREYDRLYEEYTRTSQEIQMKRTAIEAFNETTK 

IFEEQCQTQERYSKEYIEKFKREGNEKEIQRIMHN 

YDKLKSRJSEIIDSRRRLEEDLKKQAAEYRETDKR 

MNSDCPDLIQLRKTRDQYLMWLTQKGVRQKKL . 

NEWLGNENTEDQYSLVEDDEDLPHHDEKTWNV 

GSShnWKAENLLRGKRDGTFLVRESSKQGCYAC 

SVVVDGEVKHCVINKTATGYGFAEPYNLYSSLK 

ELVLHYQHTSLVQHNDSLNVTLAYPVYAQQRR 


3664 


A 


944 


406 


GATVEDQSCNFGSLRWWSVPHISARSCPDPLLS 

RTGRVPGGRGAGLPRHHSPRCCLQVFFNGANVR 

QVDVPTLTGAFGILAAHVPTLQVLRPGLVVVHA 

EDGTTSKYFVSSGSIAVNADSSVQLLAEEAVTLD 

MLDLGAAKANLEKAQAELVGTADEATRAEIQIR 

IEANEALVKALE 


3665 


A


98 


1388


ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSFHEH 

RHQSGRCLSTGMAPNLKGRPRKKKPCPQRRDSF : 

SGVKX)SNNNSDGKAVAKVKCEARSALTKPKNN 

HNCKKVSNEEKPKVAIGEECRADEQAFLVALYK 

YMKERKTPIERIPYLGFKQINLWTMFQAAQKLG 

GYETTTARRQWKHIYDELGGNPGSTSAATCTRR 

HYERLELPYERFIKGEEDKPLPPIKPRKQENSSQE 

NENKTK VSGTKEJKHEIPKSKKEKENAPKPQD AA 

EVSSEQEKEQETLISQKSIPEPLPAADMXKKIEGY 

QEFSAKPLASRVDPEKDNETDQGSNSEKVAEEA 

GEKGPTPPLPSAPLAPEKDSALVPGASKQPLTSPS 

ALVDSKQESKLCCFTESPESEPQEASFPRLPHHTG 

HRWQTRMRRRMTNCPPWQITLPTAP 


3666 


A 


113 


1492 


LLQEMCTKTIPVLWGCFLLWNLYVSSSQTIYPGI 

KARITQRALDYGVQAGMKMIEQMLKEKKLPDL 

SGSESLEFLKYDYVNYNFSN1KISAFSFPNTSLAF - 

WGVGIKALTNHGTANISTDWGFESPLFVLYNSF 

AEPMEKPILKNLNEMLCPIIASEVKALNANLSTLE 

VLTKIDNYTLLDYSLISSPEITENYLDLNLKGVFY 

PLENLTDPPFSPVPFVLPERSNSMLY1G1AEYFFKS 

ASFAHFrAGVFWTLSTEEISNHFVQNSQGLGNV 

LSRIAEIY1LSQPFMVRIMATEPPIINLQPGNFTLDI . 

PASMMLTQPKNSTVETIVSMDFVASTSVGLVIL 

GQRLVCSLSLNRFRLALPESNRSNIEVLRFEN1LSS 

ILHFGVLPLANAKLQQGFPLPNPHKFLFVNSDIEV 

LEGFLLISTDLKYETSSKQQPSFHVWEGLNLISRQ 

WRGKSAP 


3667 


A 


1 


181 


FRGFU.GSGRNGGGSMNAPPAFESFLLFEGEKITIN 
KDTKWNACLFTINKEDHTLGNIIIC 


3668 


A


212 


431 


VAGEAVPFFPMMYSEPLKPSYLALVLWYFLLTG 
YClTKPEVlr KJEQGEEPWJLEKGFPSQCriPAKYL 
WCLHD 


3669 


A 


458 


1056 


FSGVCFAGIAGSMATLLHDAVMNPAEVVKQRLQ 
MYNSQHRSAISCIRTVWRTEGLGAJYRSYTTQLT 
MNIPFQSIHFITYEFLQEQVNPHRTYNPQSHIISGG 
T AT AT A A A A TTT1T TWTfVWTT T X"n^/"\T?xr\ 7 A T OT A \TTO 

JLAUALAAAA 11 PLDVCKTLLNTQENVALSLANIS 
GRLSGMANAFRTVYQLNGLAGYFKGIQARVIYQ 
MPSTAISWSVYEFFKYFLTKRQLENRAPY 


3670 


A


145 


298 


RNPCPLTFLPSTLMVLLLSLTFFSALTFHSICQLRN 
TGVEVDIVFQRVSFL 


3671 


A 


3 


462 


ILKVAKKERTMSSLPVPYKLPVSLSVGSCVIIKGT 

PmSFMDPQLQVDFYTDMDEDSDIAFRFRVHFG 

NHVVMNRREFGIWMLEETTDYVPFEDGKQFELC 

IYVHYNEYEIKVNGHTHLRALSHRIPPSFVEDGC 

KCPRRYLPWTSVCVCN 


3672 


A 


1 


1028 


HYAKLGTRPRLKFMSSPSLSDLGKREPAAAADE 

RGTQQRRACANATWNSIHNGVIAVFQRKGLPDQ 

ELFSLNEGVRQLLKTELGSFFTEYLQNQLLTKGM 

VILfUDKIRFYEGQKLLDSLAETWDFFFSDVLPML 

QAIFYPVQGKEPSVRQLALLHFRNAITLSVKLED 

ALARAHARVPP AIV QMLLVLQG VHESRG VTEDY 

LRLETLVQKWSPYLGTYGLHSSEGPFTHSCILEK 

RLLRRSRSGDVLAKNPVVRSKSYNTPLLNPVQE 

HEAEGAAAGGTSIRRHSVSEMTSCPEPQGFSDPP 

GQGPTGTFRSSPAPHSGPCPSRLYPTTQPPEQGLD 

PTRS 


3673 


A


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI 

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSIQGCKM^O^^VNVVYTPWSNLKKTAD 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 

EteEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3674 


A 


2


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI 

WHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

A T TT "\ fT/ A *X TOTA /"^T-FTk J"V TV TT F*V TT FT FX fill *TT F f"l~fc. TT T TT T FT^ A T*v 

AHLVKANSIQGCKMNNV^A^ 
. MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 
KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 
EKEENITCKK^MDELRSYSSLMKVENMSSNQDG 
NDSDEFM 


3675


A 


921 


1321 


VTLAKMRVHISSCLKVQEQMANCPKFVPVVPTS 
QPIPSNPNRSTFACPYCGARNLDQQELVKHCVE 
SHRSDPNRVVCPICSAMPWGDPSYKSANFLQHL 
LHRHKFSYDTFVDYSIDEEAAFQAALALSLSEN 


3676 


A 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRR 

RRN4ISRYTRKAVPQSLELKGITKHALNBHPPPEK 

LEEISPTSDSHEKDTSSQSKSDITRESSFTSADTGN . 

SLSAFPSYTGAGISTEGSSDFSWGYGELDQNATE 

KVQTMFTAIDELLYEQKLSVHTKSLQEECQQWT 

ASFPHLRILGRQIITPSEGYRLYPRSPSAVSASYET 

TLSQERDST1FGIRGKKLHFSSSYAHKASSIAKSSS 

FCSMERDEEDSIIVSEGIIEEYLAFDHIDIEEGFHG 

KKSEAATEKQKLGYPPIAPFYCMKEDVLAYVFD 

bv WLKV VbCMbyL 1 Kbri Wb(jr AoUUbSN VAVT 

RPDSESSCVLSELHPLVLPRVPQSKVLYTTSNPMS 

LCQASRHQPNVNDLLVHGMPLQPRNLSLMDKLL 

DLDDKLLMRPGSSTILSTRNWPNRAVEFSTSSLS 

YTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEEIL 

RGARVP VAPD SLS SPSPTPLSRNNLLPPIGTAEVE 

HVSTVGPQRQMKPHGDSSRAQSAVVDEPNYQQ 

PQERLLLPDFFPRPNTTQSFLLDTQYRRSCAVEYP 

HQARPGRGSAGPQLHGSTKSQSGGRPVSRTRQG 

P 


3677 


A 


246 


757 


MRLQGAIFVLLPHLGPILVWLFTRDHMSGWCEG 

PRMLSwCPFYKVLLLVQTAIYSVVGyASYLVWK 

DLGGGLGWPLALPLGLYAVQLTISWTVLVLFFT 

VHNPGLALLHLLLLYGLVVSTALIWHPINKLAAL 

LLLPYLAWLTVTSALTYHLWRDSLCPVHQPQPT 

EKSD 


3678 


A


20 


1508 


RGKAEFFLAMAGTN ALLMLENFIDGKFLPCS S YI 

DSYDPSTGEVYCRVPNSGKDEIEAAVKAAREAFP 

SWSSRSPQERSRVLNQVADLLEQSLEEFAQAESK 

DQGKTLALARTMDIPRSVQNFRFFASSSLHHTSE 

CTQMDHLGCMHYTVRAPVGVAGLISPWNLPLY 

LLTWiOAPAMAAGNTVLAKPSELTSVTAWMLCK 

LLDKAGVPPGWNIVFGTGPRVGEALVSHPEVPL 

ISFTGSQPTAERJTQLSAPHCKKLSLELGGKNPAn 

FEDANLDECIPATVRSSFANQGEICLCTSRIFVQK 

SIYSEFLKRFVEATRKWKVGEPSDPLVSIGALISK 

AHLEKVRSYVKJIALAEGAQIWCGEGVDKLSLPA 

RNQAGYFMLPTVITDIKDESCCMTEEIFGPVTCV 

VPFDSEEEV1ERANNVKYGLAATVWSSNVGRVH 

RVAKKLQSGLVWTNCWLIRELNLPFGGMKSSGI 

GREGAKDSYDFFTEIKTITVKH 


3679 


A 


1862 


502 


MAGTOYMEIQTTIREYYEHLYANKLENLEEMD 

KFLDTYTLPRLNQEEVESLNRPITGSEEEAIINSLP 

TKIQPGPDRFTAKFYQRYKEELSNLIHYLGLSHH : 

LLALNFDVSFGKKSAWSSAQVKVTDTDFDGVEV 

RVFEGPPKPEEPLKRSVVYIHGGGWALASAKIRY 

YDELCTAMAEELNAVIVSIEYliLVPKVYFPEQIH : 

DVVRATKYFLKPEVLQKYMVDPGR1CISGDSAG 

GNLAAALGQQFTQDASLKNKLKLQALIYPVLQA 

LDFNTPSYQQNVNTPILPRYVMVKYWVDYFKG 

NYDFVQAMrWNHTSLDVEEAAAVRARLNWTS 

LLPASFTKNYKPWQTTGNARIVQELPQLLDARS 

APLIADQAVLQLLPKTYILTCEHDVLRDDGIMYA 

KRLESAGVEVTLDHFEDGFHGCMIFTSWPTNFSV 

GIRTRNSYIKWLDQNL 


3680


A 


249 


2146 


RSWGAPWFWRMRLLRRRHMPLRLAMVGCAFV 

LFLFLLHRDVSSREEATEKPWLKSLVSRKDHVLD 

LMLEAMNNLRDSMPKLQIRAPEAQQTLFSINQSC 

LPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 

SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSL 

GPDTRPPECVDQKFRRCPPLATTSVIIVFHNEAWS 

TLLRTV Y S VLHTTP AILLKEIIL VDD ASTEEHLKE 

KLEQ YVKQLQV VRVVRQEERKGLITARLLG AS V 

AQAEVLTFLDAHCECFHGWLEPLLARIAEDKTV 

VVSPDIVTBDLNTFEFAKPVQRGRVHSRGNFDWS 

T TCriUrPTT DDTJrT?fc r fYP P V"Pit?T*V15TV A f%/~ll T?CT 

L> 1 r u W Jd 1 L.r r n&iSJ^isJsjSAJE, 1 i rllvor 1 r AvjOJLr M 

SKSYFEHIGTYDNQMEIWGGENVEMSFRVWQC 

GGQLEnPCSVVGHVFRTKSPHTFPKGTSVIARNQ 

VRLAEVWMDSYKKJFYRIO^LQAAKMAQEKSFG 

DISERLQLREQLHCHNFSWYLHNVYPEMFVPDL 

TPTFYGAIKNLGTNQCLDVGENNRGGKPLIMYS 
CHGLGGNQYFEYTTQRDLRHNIAKQLGLHVSKG 
ALGLGSCHFTGKNSQVPKDEEWELAQDQLERNS 
GSGTCLTSQDKKPAMAPCNPSDPHQLWLFV 


3681 


A 


2982 


1869 


LKDTLKSQMTQEASDEAEDMKEAMNRMIDELN 

KQVSELSQLYKEAQAELEDYRKRKSLEDVTAEY 

IHKAEHEKLMQLTNVSRAKAEDALSEMKSQYSK 

VLNELTQLKQLVDAQKENSVSITEHLQVITTLRT 

AAKEMEEKISNLKEHLASKEVEVAKLEKQLLEE 

KAAMTDAMVPRSSYEKLQSSLESEVSVLASKLK 

ESVKEKEKVHSEVVQIRSEVSQVICREKENIQTLL 

KSKEQEVNELLQKFQQAQEELAEMKRYSESSSK . 

LEEDKDKKINEMSKEVTKLKEALNSLSQLSYSTS 

SSKRQSQQLEALQQQVKQLQNQLAECKKQHQE 

VISVYRMHLLYAVQGQMDEDVQKVLKQILTMC 

KNQSQKK 


3682 


A 


447 


1024 


AQALTAGRQLALAAPFIAPISPISLPRLNPPSQSW 

NSTPFFKVKLPPQKEV1TSDELMAHLGNCLLSIKP 

QEKSEGLQLNFQQNVDDAMTVLPKLATGLD VN 

VRFTGVSDFEYTPECSVFDLLGIPLYHGWLVDPQ 

QSPEAVRAVGKLSYNQL/VGEDHHLQTLQ*HQP 

RDRKPDCRAVPGDHRGPSDLPRTV 


3683 


A 


2 


942 


LEIKQEEIO^GQCII<GEELMHGECVKEEKDFLKKE 

IVDDTKVKEEPPINHPVGCKRKLAMSRCETCGTE 

EAKYRCPRCMRYSCSLPCVKKHKAELTCNGVRD 

KTAYISIQQFTEMNLLSDYRFLEDVARTADHISR 

DAFLKRPISNKYM YFMKNRARRQGINLKLLPNG . 

FTKRKENSTFFDKXKQQFCWHVKLQFPQSQA\ST 

*KKRVPDDKTINEILKPYIDPEKSDPV1RQRLKAYI 

RSQTGVQILMKIEYMQQNLVRYYELDPYKSLLD ' 

NLRNKVIIEYPTLHVVLKGSNNDMKVLHQVKSE 

STKNVGNEN 


3684 


A


119 


1533 


SLQENVQEKRVRVCPGLGGLLPNGTPSITAAAAP 

QVLWRHVQPGCSHHLHACVIRAACRAGEGHAD 

RHAGPPET/PVTLPSSWPWSSPWERQCPMH\L*AP 

GHAFRPVPTEHRRGWAALGHHRAAAGPLREPAS 

GSQPAPASC*PECHHGCPEQTRQCQDLLREAW 

APEQRG*PCAHLQT*ATATTLCPQVPAGRVWQP 

GHSCHLLPHRHDGSH*HHCAAHRRPVTRRQAAH 

GVPLPDACYSPHHTLPAAPPPATRPAGHTATHPE 

♦GGDLTPVPDGPHDCPRDVQGIPGAGGGSQLAPC 

CPPFPAAPVSVQGTQGLGPKNVLH*QWEGIRWQ 

KEPE/PGPPPEVELKRGAKCRIGDHGLGAVLGQG 

EYAS*SPSIPW*ASSSACPPLHPTP/TVYTQSPAAA 

PGWTRPPSP/PPPGLYPGP/PASHAPGVRGGISHQL 

YSLP*LCRECCSCP/PPPPAHGGRCPSLLPPEALAK 

T T T v 

LLL 


3685


A 


101 


438 


AWVLQCKINTELQTEVVMLKSMVLWLGEQVQS 
LQLQQQLHCHFNHTmCVTNLEYN\KEYPWDLV 
KAHLQGASTSNITFDIGELQKKULDLNKQTQEFQ 


3686 


A 


105 


845 


VSDWBCNQLVEVQCRQDGCDAVENVHQMFMF 
NWFTDCLWTLFLSNYQPSVESSSPGGSATSDDHE 
FDPS ADMLVHDFDDERTLEEEEMMEGETNFS SEI 
EDLAREGDMPIHELLSLYGYGSTVRLPEEDEEEE 
EEEEEGEDDEDAD^fDDNSGCSGENKEENIKDSS 
GQEDETQSSNDDPSQSVASQDAQEIIRPRRCKYF 
DTNSEVEEESEEDEDYIP/SnSFFQSSDGI*SSSSSE 
DWKKEIMVGS 


3687 


A 


49 


1225 


PVLVTSLRMREADTLRPPQLMEVSADIISTVEFN 
HTGELLATGDKGGRWIFQREPESKNAPHSQGE 
YDVYSTFQSHEPEFDYLKSLEIEEKINKJKWLPQQ 
NAAHSLLSTNDKTIKLWKITERDKRPEGYNLKDE 
EGKLKDLSTVTSLQVPVLKPMDLMVEVSPRRIFA 
NGHTYHINSISVNSDCETYMSADDLRINLWHLAI 
TDRSFTPVNIVDIKPANMEDLTEVITASEFHPHHC 
J^FVYSSSKGSLRLCDMIIAAALCDKHSKLFEEPE 
- DPSNRSFFSEnS\SVSDVKFSHSDRYMLTR\DYLT 
VKVWDL>nvEARPffiTYQVro YLRSKLCSLYEND 
CIFDKFEGAWNGSDR/IIMTGAYNNFFRMFDRNT 
KRDVTLEASRGSSKPRAVL 


3688 


A 


1 


401 


KKVPGRLSEMSFSLNFTLPANTTSSPVTVDCGPSL 
GLAAGIFLLVATALLVALLI^LIHRRRSSIEAMEE 

CHD TJ/^T?TOT?TT\TVK.TT">T/' TOT?"Vm'T) T"> pnTTJCT/\TTK //*« A /~\T* 

oJL^KrCblbJilDDNPlUbENPl^ 

ArtL i V is. l V Au J> c c*r V rtUK Y Kr 1 i bMbKKK 


3689 


A 


698 


889 


GRVLVHCAMGVSRSATLVLAFLMIYENMTLVEA 
IPDGAGPPQISALTQAFVRQLQVLDNRLGRE 


3690 


A 


61 


153 


MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 


3691 


A 


61 


153 


MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 


3692 


A 




2831 


PLVRRLLRQTLRRVGGARAVREAVMRAVLTWR 
DKAEHCINDIAFKPDGTQLILAAGSRLLVYDTSD 
GTLLQPLKGHKDTVYCVAYAKDGKRFASGSAD 
KSVUWTSKLEGILKYTHNDAIQCVSYNPITHQLA 
SCSSSDFGLWSPEQKSVSKHKSSSKIICCSWTNDG 
QYlALGMFNGIISlRls[KNGEEKVKIERPGGSLSPI 
WSICWNPSSRWESFWMNRENEDAEDVIVNRYIQ 
EIPSTLKSAVYSSQGSEAEEEEPEEEDDSPRDDNL 
EERNDILAVADWG\QKVSFYQLSGKQIGKDRAL 
NFDPCCISYFTKGEYDLLGGSDKQVSLFTKDGVR 
LGTVGEQNSWVWTGQAKPDSNYVVGGCQDGTI 
SFYQLIFSTVHGLYKDRYAYRDSMTDVIVQHLIT 
EQKVRIKCKELVKKIAIYRNRLAIQLPEKILIYELY 
SEDLSDMHYRVKEKin<CKFECNLLVVCANHIILC . 
QEKRLQCLSFSGVKEREWQMESLIRYIKVIGGPP 
GREGLLVGLKNGQILKIFVDNLFAIVLLKQATAV 
RCLDMSASRKKLAVVDENDTCLVYDIDTKELLF 
QEPNANSVAWNTQCEDMLCFSGGGYLNIKASTF 
PVHRQKLQGFVVGYNGSKIFCLHVFSISAVEVPQ 
SAPMYQYLDRKLFKEAYQIACLGVTDTDWRELA 
MEALEGLDFETAKKERKKRGETNNDLFLADVFS 
YQGKFHEAAKLYBCRSGHENLALEMYTDLCMFE 
. YAKIDFLGSGDPKET1<MLITKQADWARNIKEPKA 
AVEMYISAGEHVKAIEICGDHG WVDMLIDIARK 
LDKAEREPLLLCATYLKKLDSPGYAAETYLKMG 
DLKSLVQLHVETQRWDEAFALGEKHPEFKDDIY 
MPYAO WL AFNDRFFF A OK A WTK ArjR n*R P A VHV 

LEQLTNNAVAESRFNDAAYYYWMLSMQCLDIA 
QDPAQKD 


3693 


A 


3


1099


SSFPTCMRTVFHSNTSVSSLLHRPGHVTPQLTIHG 
GWRHHRDHTAIDEWDFNPSKFLrYTCLLLFSVLL 
PLRLDGIIQWSYWAVFAPIWLWKLLWAGASVG 

AGVWARNPRYRTEGEACVEFKAMLIAVGIHLLL 

LMFEVLVCDRVERGraFWLLVFMPLFFVSPVSV 

AACVWGFRHDRSLELEILCSVNILQFIFIALKLDRI 

IHWPWLVVFVPLWILMSFLCLVVLYYIVWSLLFL 

RSLDVVAEQRRTHVTMAISWITIVVPLLTFEVLL 

VHRLDGHOTFSYVSIFVPLWLSLLTLMATTFRRK 

uGNHW WFAlKl^F/CQDQLPQ 

HGEKALPLQNKDRGSWPASRGSPRLL 


3694


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIIVIAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3695 


A 


483 


761 


PRSLJDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3696 


A 


456 


733 


LSAALWEEPILSLWSETKELTNRGKMNYPQIGPH 
RPHVKGLRVRPGPGTLSNAPKSLCPGMSNSDRGI 
H\GGEGQGPGKRAGHLGRGGGMSFL 


3697 


A 


877


1873 


VWL*TLS*HTCALMTVCRSCLVKYLEENNTCPT 
CRIVIHQSHPLQYIGHDRTMQDIVYKLVPGLQEA 
EMRKQREFYHKLGMEVPGDDCGETCSAKQHLDS 
HRNGETKADDSSNKEAAE 


3698 


A 


1 


572 


KQCGIPHEVVRDENSSVYAEVSRLLLATGHWKR 

LRRDNPRPNLMLGERNRLPFGRLGHEPGLVQLV 

NYYRGADKLCRKASLVKLIKTSPELAESCTWFPE 

SYVIYPTNLKTPVAPAQNGIQPPISNSRTDEREFFL 

ASYNRKKEDGEGNVW1AKSSAGAKVWVQW*M 

TDLEEEIDIPSPVGLGLESEWPL 


3699


A 

A 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKVEE 
HHLQPVQ VLQTLLHS ATAGTGCRRPARPPPAPPT 
PTPWRSRQSGKQSERAS*LKGRGRYGLGALGGR 
GGRALGG SRWPPPLPGETLFSGCKHRRRRRG SD 
AAPGEEAGT 


3700 


A 


33 


1318 


GYQIGMALASGPARRALAGSGQLGLGGFGAPRR 

GAYEWGVRSTRKSEPPPLDRVYEPGLEPITFAG 

KMHFVPWLARPIFPPWDRGYKDPRFYRSPPLHE 

HPLYKDQACYIFHHRCRLLEGVKQALWLTKTKL 

mGLPEKVLSLWDPRIffllENQDECVLNVISHARL 

WQTTEEIPKRETYCPVIVDNLIQLCKSQILKHPSL 

ARRICVQNSTFSATWNRESLLLQVRGSGGARLST 

KDPLPmSMEmATKMT^ETFYPISPnDLHECN 

1YDVKNDTGFQEGYPYPYPHTLYLLDKANLRPH 

RLQPDQLRAKMILFAFGSALAQARLLYGNDAKV 

LEQPWVQSVGTDGRWl^WQLNTTDLDSNE 

GVKNLAWVDSDQLLYQHFWCLPVIKKRVVVEP 

VGPVGFKPETFRKFLALYLHGAA 


3701 


A 


86


465 


WTLCGPEAGMVGYDPKPDGRNNTKFQVAVAGS 
VSGLVTRALISPFDVTKIRFQLQHERLSRSDPSAK 
YHGILQASRQELQEEGPTAFWKGHVPAQILSIGY 
GAVQFLSFEMLTELVHRGSVYDARE 


3702


A 




O l*r 


ur WJiivlPs^ooritMVUJr^ VlJVJJli J roWUJL> 
SCQDELNSSDTTAEIFQEDTVRSPFLYNKDVNGK 
VVLWKGDVALLNCTAIVNTSNESLTDKNPVSESI 
FMLAGPDLKEDLQKLKGCRTGEAQLTKGFNLAA 
miHTVGPKYKSRYRTAAESSLYSCYRNVLQLA 
KEQSMSSVGFCVINSAKRGYPLKDATHIALRTVR 
RFLEIHGETIEKVV 


3703 


A 


128


1255 


SLGPSPKSATIPCCGDTMAPEEDAGGEALGGSFW 

EAGNYRRTVQRVEDGHRLCGDLVSCFQERAR1E 

KAYAQQLADWARKWRGTVEKGPQYGTLEKAW 

HAFFTAAERLSALHLEVREKLQGQDSERVRAWQ 

RGAFHRPVLGGFRESRAAEDGFRKAQKPWLKRL 

KEVEASKKSYHAARJCDEKTAQTRESHAKADSA 

VSQEQLRKLQERVERCAK^AEKTKAQyEQTLAE 

JLrrij\ i i riv i ivjuCrJ^iviriv^/\jr JD i AAiliv*s^ivL.Lrr r iSJJ 

MLLTLHQHLDLSSSEKFHELHRDLHQGIEAASDE 
EDLRWWRSTHGPGMAMNWPQFEEWSLDTQRTI 

^RKFK frOR ^PTYFVTT T^TVPTPnnTAPPPnQPncrp 

GTGQDEEWSDEESP 


3704 


A 


1 


271 


ARGEDLALATGGGPDTVTHSNMPCPNSLVYDC 

WLNIKECSVGEHTFEDLGLCPGRNQREKKRSYK 

DFLREEEKIAAQVRNSSKKKLKDSE 


3705 


A


170 


1318 


LNWANLVIMWPREEEKEKVQDYSLGGLSPDLR1 

DVSRKKKILKAYDEDEDEDLYPDIHPPPSLPLPG 

QFTCPQCRKSFTRRSFRPNLQLANMVQ1IRQMCP 

TPYRGNRSNDQGMCFKHQEALKLFCE VDKEAIC 

VVCRESRSHKQHSVLPLEEVVQEYKAKLQGHVE 

PLRKHLEAVQKMKAKEERRVTELKSQMKSELA 

AVASEFGRLTRFLAEEQAGLERRLREMHEAQLG 

1? A n A A A QP T A ETO A A fM OD T T A "D A ACT? Qr\r\nm T5 
KAVJ AAA oivLi Ally AA K^L, oiSJLX AJlAV^cKo^^ OurL-K 

LLQDIKETFNRCEEVQLQPPEVWSPDPCQPHSHD 
FLTDAIVRKMSRMFCQAARVDLTLDPDTAHPAL 
MLSPDRRGVRLAERRQEVADHPKRFSADCCVLG 
AQGFRSGRHYWEVCMGP 


3706 


A 


204 


1996 


SRERQTTWMDHNFAPAPPEMQSHGAPGPGTSFS 

HSHVLGRPIRPSRLPGGGSPLTPVLRKTIHLDTFP 

QSHIPQTSSRLGLGARTRSVPPQETGIALGASLSP 

LPTSSLVPRKLSSISLTLHQNSQARSLDRPLSHWE 

ELPTPGKKAAPHEGGRVSSPGSPPVTLVPGGRVH 

SEGPGNPGLTKSNRMLATEICPLVSSYLALPFQSR 

LAQSAPVLAEPGSLGQGHLVSVTDHMPTRASPG 

KGKPRARGIPRPRGRLQRANTTVNLTAMDTRTD 

AARHLATMATNRPSLAINLATPNTSQLDTGTEFP 

ALD1KLGTARDLSSVGTVKSGKTVNLATAGTIKP 

GTAMNLTTVGTIXPGMVMDLIASEPDKLGKAM 

ATRSTAKPDMTTCG1AMDSATSDPVKPDTITATV 

GTSRLETAMALARVNRAKLGTAKNSLALDTSR 

lvivj x r\v vjro v vrvi rur/x i vjjv I i JLOo v IN iNL> 1 io U V 

ATCLLMPSRSTDLALDNTNAAMDRATEPASLDL 
ATEYKGKCRNLVGDGLGCREGEVCELGDGSMK 
PMSINSNLLGYIGmTimQMRKK^lKTGFDFNIM 
WGTEGCGAAAGLVAGSTKDPISFPQ 


3707 


A 


3 


549 


SSSISRDFLGQAACASGTMLRWLRDFVLPTAACQ 

NLGIPLGQDAEEKIFTTGDVNKDGKLDFEEFMKY 
LKDHEBQCMKLAFKSLDKNNDGKIEASEIVQSLQ 
TLGLTISEQQAELILQSIDVDGTMTVDWNEWRD 
YFLFNP V TDIEEIIR 


3708 


A 


1 


1866 


EFRGAGRANMLAPRGAAVLLLHLVLQRWLAAG 
AQATPQVFDLLPSSSQRLNPGALLPVLTDPALND 
LYVISTFKLQTKSSATIFGLYSSTDNSKYFEFTVM 

GRLSKAILRYLKIsTDGKVHLVX^mQLADGRJlH 

RILLRLSKLQRGAGSLELYLDCIQVDSVHNLPRA 

FAGPSQKPETIELRTFQRKPQDFLEELKLVVRGSL 

FQVASLQDCFLQQSEPLAATGTGDFNRQFLGQM 

TQLNQLLGEVKDLLRQEVNETSFLRNTITECQAC 

GPLKFQSPTPSTWPPASPAPPTRPPRRCDSNPCF 

RGVQCTDSRDGFQCGPCPEGYTGNGITCIDVDEC 

KYHPCYPGEHCINLSPGFRCDACPVGFTGPMVQ 

GVGISFAKSNKQVCTDIDECRNGACVPNSICVNT 

LGSYRCGPCKPGYTGDQIRGCKAERNCRNPELN 

PCSVNAQCEEERQGDVTCVCGVGWAGDGYICGK 

Dv^IDSYTDEELPCSARNCKKDNCKYVPNSGQE 

DADRDGIGDACDEDADGDGDLNEQDNCVLIHNV 

DQRNSDKDIFGDACDNCLSVLNNDQKDTDGDG 

KvjDACDDDMDGDGIK^ILDiSIC^ 

DGDGVGDACDSCPDVSNPNQ 


3709 


A 


144


417 


TQAMEGLLHYINPAHAISLLSALNEERLKGQLCD 
VLLIVGDQKFRAHKNVLAASSEYFQSLFTNKENE 
SQTVFQLDFCEPDAFDNVLNYIY 


3710


A 

A 


245 




FGMLKINKGriSSKKDNLAVNAVALQDHILHDLQ 

LRNLSVADHSKTQVQKKENKSLKRDTKAIIDTGL 

KKTTQCPKLEDSEKEYVLDPKPPPLTLAQKLGL1 

GPPPPPLSSDEWEKVKQRSLLQGDSVQPCPICKE 

EFELRPQVFSIRG 


3711 


A 


3 


773 


SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRS 
TPAMMNGQGSTTSSSKNIA YNCCWDQCQACFNS . 
SPDLADHIRSIHVDGQRGGVFVCLWKGCKVYNT 
PSTSQSWLQRHMLTHSGDKPFKCVVGGCNASFA 
SQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSK 
AGMNKRRKLKNKRRRSLARPHDFFDAQTLDAIR 
HRAICFNLSAHIESLGKGHSVVFHSTVSILLFFQIK 
i K 1 .Lv^N lo 1 11SKSLKI 


3712 


A 


2 


344


RATWHNAGKEREAVQLMAGAEKRVKASHSFLR 
GLFGGNTRIEEACEMYTRAANMFKMAKNWS AA 
GNAFCQAAKLHMQLQSKHDSATSFVDAGNAYK . 
KADPQGKTARHVACYLCV 


3713


A 


20 


974


G AAATACSS S S S S SG AP AT W A AHGPGKD V ASPS 

SVSLSPRRSRLLVLRCGLRRNPERPSSSPALRRLL 

LLLLLLLLLLLGFLLSPGPERGVGGGRFGRRLAL 

LWAAALGHWSGKVMSRRAPGSRLSSGGGGGG 

TNYSRSWNDWQPRTDSASADPGNLKYSSSRDRG- 

GSSSYGLQPSNSAWSRQIuiDDTRVHADIQNDE 

KGGYSVNGGSGENTYGRKSLGQELRVNNVTSPE 

FTSVQHGSRALATKDMRKSQERSMSYCDESRLS 

I LJLKKJ 1 RbN DKJU)KJ<JL AT V KQLKbr 1QQPENKLV 

LVKQLDILAAVHDVLNER 


3714 


A 


237 


458 


IFALKSPSYLLPCCTPEGKMDHKQLCWSHPQKSG 
QSSRSCCICSNQHGLIWKYSLNMCLQCCHQYVK 
DIGFIKL 


3715 


A 

A 




1 Jxn 


T PTT QPnTCr^TA r T QPT TTPPHTCT flTQP A fYMniTVIJ 
JHw 1 LroJrvJloLr i AuoLL 1 1 JDrU i JclAJ 1 or Al^JNvjr Y xi 

EAVVLFTQALKLNPQDHRLFGNRSFCHERLGQP 
AWALADAQVALTLRPGWPRGLFRLGKALMGLQ 
RFREAAAVFQETLRGGSQPDAARELRSCLLHLTL 
QGQRGGICAPPLSPGALQPLPHAELAPSGLPSLRC 
PRSTALRSPGLSPLLH 


3716 


A


85 


308 


QGLPSTMVKLGCSFSGKPGKDPGDQDGAAMDS 
VPLISPLDISQLQPPLPDQVVIKTQTEYQLS SPDQQ 
NYTKSR 


3717 


A 


58 


618 


GAGCTSPGLWARKAAARCLPTYPSRAQPSNVGR 

RRRRRPGLGALAAGVPAMAESVERLQQRVQELE 

RELAQERSLQVPRSGDGGGGRVRIEKMSSEWD 

SNPYSRLMALKRMGIVSDYEKIRTFAVAIVGVGG 

VGSVTAEMLTRCGIGKLLLFDYDKVELANMNRL 

FFQPHQAGLSKVQAAGHTPEE 


3718 


A 


3 


593 


RGAGGRAGGRADGQPNMADQRQRSLSTSGESL 

YHVLGLDKNATSDDIKKSYRKLALKYHPDKNPD 

NPEAADKFKEINNAHAILTDATKRNIYDKYGSLG . 

LYVAEQFGEElSrv T OTYFVLSSWWAI<ALFVFCGLL 

TCCYCCCCLCCCFNCCCGKCKPKAPEGEETEFY 

VSPEDLEAQLQSDEREATDTPIVIQPASATEP 


3719 


A 


2 


2173 


SGGVRMGSRADGPRTSGHVTGKMAVFPWHSRN 

RNYKAEFASCRLEAVPLEFGDYHPLKPITVTESK 

TKKVNRKGSTSSTSSSSSSSVVDPLSSVLDGTDPL 

SMFAATADPAALAAAMDSSRRKRDRDDNSVVG 

SDFEPWTNKRGEILARYTTTEKLSrKLFMGSEKG 

KAGTATLAMSEKVRTRLEELDDFEEGSQKELLN 

LTQQD YVNRIEELNQSLKDA WASDQKVKAPKN 

VHPGKLVYERIFSMCVDSRSVLPDHFSPENANDT 

AKETCLNWFFKIASIRELIPRF YVEASELKCNKFLS 

KTGISECLPRLTCMIRGIGDPL\GSVYARAYL\SRV 

GMEVAPHLKETLNKNFFDFLLTFKQIHGDTVQN 

QLWQGVELPSYLPLYPPAMDWIFQCISYHAPEA 

LLTEMMERCKKLGNNALLLNSVMSAFRAEFIAT 

RSMDFIGMIKECDESGFPKHLLFRSLGLNLALAD 

PPESDRLQELNEAWKV1TKLKNPQDYINCAEVWV 

EYTCKHFTKREVNTVLADVIKHMTPDRAFEDSY 

PQLQLIIBXVIAHFHDFSVLFSVEKFLPFLDMFQK 

ESVRVEVCKCI\RTTLSSINKSPPRTRSS*MPFCMF 

ARPCMTL/CNALTLEDEKRMLSYLINGFIKMVSF 

GRDFEQQLSFYVESRSMFCNLEPVLVQLIHSVNR 

LAMETRKVMKGNHSRKTAAFVRSWGAYWFITIP 

SLAGIFTRLNLYLHSG 


3720 


A 


24 


296 


ENLFRAGFAFSLLRSSFYISKTYCSWFSNLISGSL 

ADFNSKGTRDYSPRQMAVRE/KVFDVIIRCFKRH 

GAEVIDTPVFELKVRNGQEETTW 


3721 


A 


2 


310 


PSCLTCVGHCS1GGSCTMIGIMMPECHCSLHMTG 
PRCEEHVFUQQPGHIASILIPLLVLLLLALVAGVV 
FWHKRRVQGAKGFQHQRMTNGAMNVEIGNPTY 
K 


3722 


A 


75 


722 


MELV AGCYEQVLFGFAVHPEPEACGDHEQWTL 

VADFTHHAHTASLSAVAVNSRFVVTGSKDETIHI 

YDMKKKIEHGALVHHSGTITCLKFYGNRHLISGA 

EDGLICrWDAKKWECLKSIKAHKGQVTFLSIHPS 

GKLALSVGTDKTLRTWNLVEGRSAFIKNIKQNA 

xxi v Cf w or i\\JCy i v v li. v^rN iviJU 1 1 K^LtD l AoIoVj 1 1 J. IN 

EKRISSVKFLSES 


3723 


A 


110 


316 


MELSDNRRSGGLEGLAEKCPNLTYLNLSGNKIK 
DLSTVEALVSGTVLSLDLLFLVKFSEICLCLLISI 


3724 


A 


3 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 
VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 
FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 
GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 
DG 




A 

A 


J 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 

VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 

FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 

GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 

DG 


3726


A 

A 


1 


All 


SSDDRSLFRRLKLNYA1FDEGHMLKNMG SIRYQ 
HLMTINANNRLLLTGTPVQNNLLELMSLLNFVM 
PHMFSSSTSEIRRMFSSKTKSADEQSIYEKERIAH 
AKQIIKPFILRRVKEEVLKQLPPKKDRIELCAMSE - 
KQJEQLYLG 


3727 


A 


6 


383 


RIPRGKACXTVLGRSTGELEGFASSRLPPQPCGW 
GQSSDLLSRJDLDELMKKDEPPLDFPDTLEGFEY 
AFNEKGQLRHDCTGEPFVFNYREHLHRWNQKRY 
EALGEIITKYVYELLEKDCNSKKVS 


3728 


A 


3


2452


E1AGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSVVSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDE1THDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHS/YTPERLVRSRSS\DIVSSVRRPMSDPSWNRR 

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERXDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEICKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLHEHIQRLSKVVTANHRALQEPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLA>JEDSWGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKT1DD 

RK 


3729 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKJEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAWRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHQ/VTPP'R T VT? QP <? C:\nTVQ QVP 1? PN/KinPQU/KTP P 
isxio/ I 1 rJ_/lvj-> V i\oi\oouJl V oo V jSJvrjVloJL/ro Wi\ tvK. 

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 
RGETEERKDSDDEKSDR>JRPWWRKRFVSAMPK 
APIPFRKKEKQEKDICDDLGPDRFSTLTDDPSPRLS 
AQAQVAEDILDKYRNA1KRTSPSDGAMANYEST 
EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRJDAKKKLRLALCSADSVAFPVLTA 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPY1AYLTRCRQGLQTTQAHLERLLQRVLR. 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNR1FKLAFYPNQDGDILR 

DQVLHEHIQRLSKVVTANHRALQPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3730 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 
TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 
LGSTSDDTDVREVSSRPSTPGLSVVSGISATSEDIP 
NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 
LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 
SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD . 
PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 
. RHS/YTPERLVRSRSS\D1VSSVRRPMSDPSWNRR . 
P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 
RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 
APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 
AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 
EVMGDGESAHDSPRDEALQNISADDLPDSASQA 
AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 
HSTRNGLPDHTDPEDNEIVCFLKVQ1AEAINLQD 
KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 
RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 
DKEVANRYFITVCVRLLLESKEKKIREFIQDFQK 
LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 
EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 
DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP 
WPSAQSEIRTISAYKTPRDKVQCDLRMCSTIMNLL 
SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 
QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 
RK 


3731 


A 


1 


1305 


VNTAMHEAKLMEECDELVEnQQRKQMIAVKIK 

ETKVMKLRKLAQQVANCRQCLERSTVLINQAEH 

ILKENDQARFLQSAKNIAERVAMATASSQVLPDI 

NFNDAFENPALDFSREKKLLEGLDYLTAPNPPSIR 

EELCTASHDTITVHWISDDEFSISSYELQYTIFTGQ 

ANFISLYMSVDSWMIVPNIKQNHYTVHGLQSGTR 

YIFIVKAINQAGSRNSEPTRLKTNSQPFKLDPKMT 

HKKLKISNDGLQMEKDESSLKKSHTPERFSGTGC 

YVYGVLHNSDNS*MFISLSFPLSHRYAIGIAYKSA 

PKNEWIGKNASSWVPSRCNSNF^VRHNNKEML 

VDVPPHLKRLGVLLDYDNY/NMLSFYDPANSLVH 

LHTFDVTF\TLPVCPTFTIWNKSLMILSGLPAPDFI 


3732 


A 


127 


2832


LGQRLSLVPRPSLKRRLGKRLSLGLRERMMSLW 
. WS/GPKVRTQATTG ARPKTETKSVPAARPKTEAQ 
AMSGARPKTEVQVMGGARPKTEAQGITGARPKT 
DARAVGGARSKTDAKAIPGARPKDEAQAWAQS 
EFGTEAVSQAEGVSQTNAVAWPLATAESGSVTK 

SK\ACXWIEN*SMWM/PETFPGTQGQKGIQPWFG 

PGEETNMGSWCYSRPRAREEASNESGFWSADET 

STASSFWTGEETSVRSWPREESNTRSRHRAKHQT 

NPRSRPRSKQEAYVDSWSGSEDEASNPFSFWVG 

ENTONLFRPRVREEANIRSKLRTNREDCFESESED 

EFYKQSWVLPGEEANXTOSGTETKKILILPWICLRA 

QKDVDSDRVKQEPRFEEEVnGSWFWAEKEASLE 

GGASAICESEPGTEEGAIGGSAYWAEEKSSLGAV 

AREEAKjPESEEEAIFGSWFWDRDEACFDLNPCPV 

YKVSDRFRDAAEELNASSRPQTWDEVTVEFKPG 

LFHGVGFRSTSPFGIPEEASEMLEAKPKNLELSPE 

GEEQESLLQPDQPSPEFTFQYDPSYRSVREIREHL 

RARESAESESWSCSCIQCELKIGSEEFEEFLLLMD 

KJRDPFIHEISKIAMGMRSASQFTRDFIRDSGVVS 

LIETLLNYPSSRVRTSFLENMIHMAPPYPNLNMIE 

TFICQVCEETLAHSVDSLEQLTGNKGCFRHLTMT 

n)YHlALIAN*YGPGFPLLF*PQAQCGETKFHVLK 

N^LNL'SENPAVAKKLFSAKALSIFVGLFNIEETN 

DNIQIVIKMFQNI SNIIKSGKMSLIDDDFSLEPLISA 

FREFEELAKQLQAQIDNQNDPEATGTTAFVGKG 

NNPSANRERLSPSVFCPGAQEAESLPARRVRGEE 

QRLLLEEVG ARTADGIPEG W 


3733 


A 


2 


3274 


DVPLIRJEEDTGEIFTTGARIDREKLCAGIPRDEHC 

FYEVEVAILPDEIFRLVKIRFLIEDINDNAPLFPAT

VIN1SIPENSAINSKYTLPAAVDPDVGINGVQNYE 

LIKSQNIFGLDVEETPGGDKMPQLIVQKELDREEK 

DTYVMKVKVEDGGFPQRSSTAELQVSVTDTNDN 

HPVFKETEEEVSIPENAPVGTSVTQLHATDADIGE

NAKIHFSFSNLVSNIARRLFHLNATTGLITIKEPLD 

REETPNHKLL VL A SDGGLMP ARAMVL VNVTD V 

ITONVPSroiRYIVNPVNDTVVLSENIPLNTKIALIT 

VTDKDADHNGRVTCFTDHEIPFRLRPVFSNQFLL 

ETAAYLDYESTKEYAIKLLA\ADAGKPPLNQSAM 

LFIKVKDENDNAPVFTQSFVTVSIPENNSPGIQLT 

KVSAMDADSGPNAKINYLLGPDAPPEFSLDCRT 

GMLTVVKiaDREKEDKYLFTILAKDNGVPPLTS 

NVTWVSIIDQNDNSPVFTHNEYNFYVPENLPRH 

GTVGLITVTDPDYGDNSAVTLSILDENDDFTEDSQ 

TGVIRPNISFDREKQESYTFYVKAEDGGRVSRSSS

AKVTINVVDVNDNKPVFIVPPSNCSYELVLPSTN 

PGTVVFQVIAVDNDTGMNAEVRYSIVGGNTRDL 

FAIDQETGNITLMEKCDVTOLGLHRVLVKANDL 

GQPDSLFSVVIVNLFVNESVTNATLINELVPQKH 

LKHQ*PQILEIADVSSPTSDYVKILVAAVAGTITV 

WVinTAVVRCRQAPHLKAAQKNMQNSEWATP 

NPENRQMIMMKKKKKXKKHSPK^ 

TKADDVDSDGNRVTLDLPIDLEEQTMGKYNWV 

TTPTTFKPDSPDLARHYKSASPQPAFQIQPETPLN 

LKHHIIQELPLDNTFVACDSISNCSSSSSDPYSVSD 

OVJlrv 1 irHVr Vo Vri IKrJr V 1JL.I1 V O vj A V^oVj^VAJ 

LTSSLMELLLCLMVAAFLPLELRPLGQQNVMSW 
EQEAKELLVGYWGDGEWCHFHFHHLIPGPVNPG 
YERKQYHILDSDSEDTQPSGELCPIPVRPFTILSIQ 
LLQDDGEHCGTKQGFQPAVQLGLLPHKTLK 


3734 


A 


1 


840 


GTRPGHLPAPSDGFCV/HL* SIPS WGSF*GESL/EM 

QLrrSLGLQEFDLARNVLELIYAQTLVWIGIFFCPL 

LPHQMIN1LFIMFYSKMSLMMNFQPPSKAWRAS 

QMMTFFIFLLFFPSFTGVLCTLAITnVRLKPSADC 

GPFRGLPLFIHSrySWIDTLSTR^ 

LIGSVHFFFILTLIVLnTYLYWQITEGRKIMrRLLH 

EQIINEGKDKMFLIEKLIKL QDMEKKANP S S L VLE 

RREVEQQGFLHLGEHDGSLDLRSRRSVQEGNPR 

A 




A 




432 


VEVCRRYLWKMTVDASQNVQCCVIFSHFPFIFN 
NLSKIKLLHTDTLLKIESKKHKA YLRSAAIEEERE 
SEFALRPTFDLTVRRNHLIEDVLNQLSQFENEDL 
RKELWVSFSGEIGYDLGGSA^KKEIFYCLFAEMIQ 
PEYGMFMY 


3736 


A 


1542 


343 


KGAPSFVJRLYQYPNFAGPHAALANKSFFKADKV 

TMLWMCKATAVLVIASTDVDKTGASYYGEQTL 

HYIATNGESAWQLPKNGPIYDWWNSSSTEFCA 

VYGFMPAKATIFNLKCDPVFDFGTGPRNAAYYS 

PHGHILVLAGFGNLELQI*AD/IMKVWNVKNYKLI

SKPVASDSTYFAWCPDGEHILTATCAPRLRVNN 

GYKIWHYTGSILHKYDVPSNAELWQVSWQPFLD 

GIFPAKTITYQAVPSEVPNEEPKVATAYRPPALRN 

KPITNSKLHEEEPPQNMKPQSGNDKPLSKTALKN 

QRKHEAKKAAKQEARSDKSPDLAPTPAPQSTPR 

NTVSQSISGDPEIDKKIKNLKKKLKAIEQLKEQAA 

TGKQLEKNQLEKIQKETALLQELEDLELGI 


3737 


A 


3190 


664


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEE1T1T1ITTTTVTTWTSPVLC 
NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV 
YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP
RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 
GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 
TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 
CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR
WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 
SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 
PANPLLLSLRFEAFEEDRCFAPFLAHGNVTTTDPE 
YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 
WNDTEPACKAMCGGELSEPAGWLSPDWPQSY 
SPGQDCVWGVHVQEEKRILLQVEILNVREGDML
TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 
QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP
PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 
DELTCQWDLSWSAAPPACQKIMTCADPGEIANG 
HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 
YSRDTGTPKWSDRWKCALKYEPCLNPGVPENG 
YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 
PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL
ADLLPLGLVTVLGSGVYIYYTKJLQGKSLFGFSGSH 
SYSPITVESDFSNPLYEAGDTRFYFV9T 


3738 


A 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE 
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEETTTTIITTTTVTTTVTSPVLC 
KhJNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV 
YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSGGFPPRPAHGDVSVTDLHPGG

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 

WIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAFEEDRCFAPFLAHGWTTTDPE

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH

WNDTEPACKAMCGGELSEPAGVVLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DILTCQWDLSWSAAPPACQKIMTCADPGEIANG 

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL 

AILLPLGLVIVLGSGVYIYYTKLQGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 


3739 


A


734 


445 


LLEPEP AEEYTEQSEVEST/EGMILI* CCLYFAAFQ 
TNVSNIYFALQYVNRQFMAETQFTSGEKEQVDE 
WTVETVEVRVLCIAKLLSLSSVSNFYLY 


3740 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPENFEIVKXWLVNITKNF 

DIGPKFIQVGVVQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 

KIAWLTDGKSQDDVKDAAQAARDSKITLFAIG 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 

EVMKQKLCEESVCPTRIPVAARDERGFDILLGLD 

VNKKXHKJCRIQLSPKKIKGYEVTSKVDLSELTSNV 

FPEGLPPSYWVSTQRI^VKKIWDLWRILTIDG/* 

PQIAVTLNGVDKILLFTTTSVINGSQVVTFANPQV 

KTLFDEG WHQIRLLVTEQDVTLYIDDQQIENKPL 

HPVLGILINGQTQIGKYSGKEETVQFDVQKLRIY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL

PGYKGEPGRDGDK 


3741 


A 


5048


1236 


MSAPAGSSHPAASARIPPKFGGSAVSGAAAPAGP 

GAGPAPHQQNGPAQNQMQVPSGYGLHHQNYIA 

PSGHYSQGPGKMTSLPLDTQCGDYYSALYTVPT 

QNVTPNTVNQQPGAQQLYSRGPPAPfflVGSTLGS 

FQGAASSASHLHTSASQPYSSFVNHYNSPAMYS 

ASSSVASQGFPSTCGHYAMSTVSNAAYPSVSYPS 

LPAGDTYGQMFTSQNAPTVRPVKDNSFSGQNTA 

ISHPSPLPPLPSQQHHQQQSLSGYSTLTWSSPGLP 

STQDISnLIRNHTGSLAVANNNPTITVADSLSCPVM 

QNVQPPKSSPWSTVLSGSSGSSSTRTPPTANHPV 

EPVTSVTQPSELLQQKGVQYGEYVNNQASSAPT 

PLSSTSDDEEEEEEDEEAGVDSSSTTSSASPMPNS 

YDALEGGSYPDMLSSSASSPAPDPAPEPDPASAP 

APASAPAPVVPQPSKMAKPLAMAIQHFSLVIRML 

QHHLFLEYSPSNPVYSGFQQYPQQYPGVNQLSSS 
IGGLSLQSSPQPESLRPVNLTQERNILPMTPVWAP 
VPNLNADLKKLNCSPDSFRCTLTOPQTQALLNK 
AKLPLGLLLHPFRDLTQLPVITSNTIVRCRSCRTYI 
NPWSFIDQRR*KCNLCYRVNDVPEEFMYNPLT 
RSYGEPHKRPEVQNS\TVEFIASSDYMLRPPQPAV 
YLFVLDVSHNAVEAGYLTI/LWCQSLLE\NLDKLP 
G\DSR'ARIGFMTFD\STYSFLQFTQEGLSQPQMLI 
VSDIDDVFLPTPDSLLVNLYESKELIKDLLNALPN 
MFTNTRETHSALGPALQAAFICLMSPTGGRVSVF 
QTQLPSLGAGLLQSREDPNQRSSTKVVQHLGPAT 
DFYKKLALDCSGQQTAVDLFLLSSQYSDLASLA 
CMSKYSAGC1YYYPSFHYTHNPSQAEKLQKDLK 
RYLTRKIGFEAVMRIRCTKGLSMHTFHGNFFVRS 
TDLLSLANINPDAGFAVQLSIEESLTDTSLVCFQT 
ALLYTSSKGERRIRVHTLCLPWSSLSDVYAGVD 
VQAAICLLANMAVDRSVSSSLSDARDALVNAW 
DSLSAYGSTVSNLQHSALMAPSSLKLFPLYVLAL
LKQKAFRTGTSTRLDDRVYAMCQIKSQPLVHLM 
KMIHPNLYRIDRLTDEGAVHVNDRIVPQPPLQKL 
SAEKLTREGAFLMDCGSVFYIWVGKGCDNNFIE 
DVLGYTNFASIPQKMTHLPELDTLSSERARSFIT 
WLRDSRPLSPILHIVKDESPAKAEFFQHLIEDRTE 
AAFSYYEFLLHVQQQICK 


3742 


A 


934 


68 


SMLASQGVLLHPYGVPMTVPAAPYLPGLIQGNQE 

AAAAPDTMAQPYASAQFAPPQNGIPAEYTAPHP 

HPAPEYTGQTTVPEHTLNLYPPAQTHSEQSPADT 

SAQTVSGTRNKQD*RSTDGWPSPKTQTS*KHGK 

QVSSPSGLHVSNIPFR\FRDPDLRQMF\GQFGKILD 

VE1IFNERG SKGFGFVTFENSAD ADRAREK\LHGT 

VV\EGRKI^VN\NATARVMTNKXTVNPYTNGWK 

LNPVVGAVYSPEFYAGTVLLCQANQEGSSMYSA 

PSTDFRGAKLHTSRPLLSGS 


3743


A 


3 


1456 


QFQQAWMQNKVPIPAPNEVLNDRXEDIKLEEKK 

KTQAEIEQEMATLQYTNPQLLEQLKIERLAQKQV 

EQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGF/PTA 

PSISADANEHGS\KGPPGPQGQFRPPGPQGQMGP 

QGPPLHQGGGGPQGFMGPQGPQGPPQGLPRPQD 

MHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQG 

HMGPQGPPGPQGHIGPQGPPGPQGHLGPQGPPGT 

QGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGP 

VSQGPLMGLNPKGMQGPPGPRENQGPAPQGMI 

MGHPPQEMRGPHPPGGLLGHGPQEMRGPQEIRG 

MQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGSL 

GPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQ 

QQKTPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQ 

GAQGRIPPLNPGQGPGPNKVS/ERGAPPRHEGRA 

PPRGRDGFPGPMKTLV 


3744 


A 


1571 


652 


pltgrkcpgwthsgsrrspriaeevpgfpkraea 

srqfsetadrlellrravmaaarattpadgeep 

apeaealaaarerssrflsglelvkqgaearvfr 

grfqgpXavikhrfpkgyrhpalearlgrrrtv 

qearallrcrragisapvvffvdyasnclymeei 

egsvtvrd\ifsplwrlkktpqglsnlaktigqvl 

armhdedlihgdlttsnmllkppleqlnivlidf 

glsfisalpedkgvdlyvlekaflsthpntetvfe 
AFLKSYSTSSKKARPVLKKLDEVRLRGKKRSMV 
G 


3745 


A 


127 


1433 


GSHRFSLASPLDPEVGPYCDTPTMRTLFNLLWLA 
LACSPVHTTLSKSDAKXAASKTLLEKSQFSDKPV 
QDRGLVVTDLKAESVVLEHRSYCSAKARDRHFA 
GDVLGYVTPWNSHGYDVTKVFGSKFTQISPVWL 
QLKRRGREMFEVTGLHDVDQGWMRAVRKHAK 
' GL\P*CLGSCLRTGLTMISGA r VLDSEDEIEELSKT 
WQVAICNQHFDGFWEVWNQLLSQKRVGLIHM 
LTH1AEALHQARLLALLVIPPAITPGTDQLGMFT 
HKEFEQLAPVLDGFSLMTYDYSTAHQPGPNAPL 
SWVRACVQVLDPKSKWRSKILLGLNFYGMDYA 
TSKDAREPWGARYIQTLKDHRPRMVWDSQVSE 
HFFEYKKSRSGRHVVFYPTLKSLQVRLELARELG 
VGVSIWELGQGLDYFYDLL*VGIAASAVDVFFSK 
PWSE 


3746 


A 


1


898 


IDRAAECRTKPLPMAVSIRGNADSIVACLVLMVL 

YLIKKRLVACAAVFYGFAVHMKIYPETYILPITL 

HLLPDRDNDKSLRQFRYTFQACL*ELLKRLCNRT 

ALMFVAVAGLTFFALSFGFYYEYGWEFLEHTYF 

YHLTRRDERHNFSPYFYMLYLTAESKWSFSLGIA 

AFLPQLILLSAVSFAYYRDLVFCWFLHTSIFVTFN 

KVCTSQ YFL WYLCLLPLVMPLVRMP WKRA WL 

LMLWFIGQAMWLAPAYVLEFQGKNTFLFIWLA 

GLFFLLINCSILIQIISHYKEEPLTERIKYD 


3747


A 


1


2325


MVISFQGLVTFGDVAVDFSQEEWEWLNPIQRNL 
YRKVMLENYRNLASLGLCVSKPDVISSLEQGKEP 
WTVKRKMTRA WCPDLKA VWKIKELPLKKDFCE 
GKLSQA VITERLTS YNLEYSLLGEHWDYDALFET 
QPGLVTIKNLAVDFRQQLHPAQKNFCKNGIWEN 
NSDLGSAGHCVAKPDLVSLLEQEKEPWMVKREL 
TGSLFSGQRSVHETQELFPKQDSYAEGVTDRTSN
TKLDCSSFRENWDSDYVFGRKLAVGQETQFRQE 
PITHNKTLSKERERTTNKSGRWFYLDDSEEKVH 
NRDSIKNFQKSSVVIKQTGIYAGKKLFKCNECKK 
TFTQSSSLTVHQRIHTGEICPYKCNECGKAFSDGS 
SFARHQRCHTGKKPYECIECGKAFIQNTSLIRHW 
RYYHTGEKPFDCIDGGKAFSDHIGLNQHRRIHTG 
EKPYKCDVCHKSF\RYGSSLTVHQRIHTGEKPYE 
CDVCRKAFSHHASLT\Q\HQRVHSGEKPFKCKEC 
GKAFRQNIHLASHLRIHTGEKPFECAECGKSFSIS 
SQLATHQRIHTGEKPYECKVCSKAFTQKAHLAQ 
HQKTHTGEKPYECKECGKAFSQTTHLIQHQRVH 
TGEKPYKCMECGKAFGDNSSCTQHQRLHTGQRP 
YECIECGKAFKTKSSLICHRRSHTGEKPYECSVC 
GKAFSHRQSLSVHQRIHSGKKPYECKECRKTFIQI 
GHLNQHKRVHTGERSYNYKKSRKVFRQTAHLA 
HHQRIHTGESSTCPSLPSTSNPVDLFPBCFLWNPSS 
LPSP 


3748 


A 


823 


1 



GGYTKSGYDSACKDFVPHDLEVQIPGRVFLVTG

cr^l CVW a T A T tjt a t/"d cifyrxrvn \T/^T>T\r\ a n a rn a 
OiNoUlLrls.Al AJUfcJLAJvrvUtj J VHLrVCKDtJAPAbDA 

RGEIIRE\'SGNQNIFLHWDLSDPKKIWKFVENFKQ 
EHKLHVLVVTsTMAGCMVNKREAHKKMDFE 
CQYSGVCTFLTTRPDPLCWRKNTDPRVIT\VSSG 
GMLVQKLNNQ*SPVRKNTIWMGTMVYAQNKVS 
ERQQVVLI\ERWGPRAPG\IHFSSMHPGWA\DTPG 
VRQAMPGFHVQASGYRLRSEAQGADTMLWLAL 
SSARSRTAQRP 


3749 


A 


1939 


715 


GFLRLSQATJIQRLSIPVMVLTLDPTRD\QCFGDR 

FSRLLLDEFLGYDDIL\MSS VKGLAENEENKGFLR 

NWSGEHYRFV\SMWMART\SYLAAFANHGQSF 

TLSVSHACCGYSHHQIFVFIVDLLQMLEMNMAIA 

FPAAPLLTVILALVGMEAIMSEFFNDTTTAFYnLI 

VWLADQYDAICCHTSTSKRHWLRFFYLYHFAFY 

AYHYRFNGQYSSLALVTSWLFIQHSMIYFFHHYE 

LPAILQHVRIQ\EMLLQAPTLGPGTPTA\LPDDMN 

IWSGAPATAP\DSAGQPPALGPVSPGASGSPGPV 

AAAPSSLVAAAASVAAAAGGDLGWMAETAA1IT 

DASFLSGLSASLLERRPASPLGPAGGLPHAPQDS 

VPPSDSAASDTTPLGAAVGGPSPASMAPTEAPSE 

VGS 


3750 


A 


2 


844 


GLLEPFSKLLSFVIQNAVFTLAYLVELCGLCYRA 

FTKERDKFYLSRSWLELLQALKLKSPLPDTNLL 

LLVQFICADAGTKLAESTILSKQM1ASVPGCGTA 

AMECVRQYIOTVLDFM\ADMHTLTia.KSHMKTC 

SQPLHEDTFGGHLKVGLAQMA1VIDISRGWIRDN 

KAVIRYLPWLYHPPS AMQQGPKEF1ECVSHIRLL 

SWLLLGSLTHNAVC/LKWPPLPGLPIPLDAGSHV

ADHLIVn^IGFPEQSKTSVL\HMCSLFHARSLAQL 

WDSLLARQSGRW 


3751


A 



431 


2 


AFTRKCEETAF1YPQCEIIPTE/WVCRRJPTGSSLER 

NPGVKEGCEFCPPKVEMFFKDDANHDPQWSRQ 

QLIAAKFGFAALGyQTEVDIMSHAT*AVFEEPEKS 

RL\PQNCTPVDMICIEFGVHVTSKEILTDV1DNDS* 

RHSPS 


3752 


A 


131 


1278 


AWSGSGLLVLCINTASMPMISVLGKMFLWQREG 

PGGRWTCQTSRRVSSDPAWAVEWIELPRGLSLSS 

LGSARTLRGWSRSSRPSSVDSQDLPEVNVGDTV 

AMLPKSRRALTIQEIAALARSSLHGISQVVKDHV 

TKPTAMAQGRVAHLIEWKGWSKPSDSPAALESA 

FSSYSDLSEGEQEARFAAGVAEQFAIAEAKLRA 

WSSVDGEDSTDDSYDEDFAGGMDTDMAGQLPL 

GPHLQDLFTGHRFSRPVRQGSVEPESDCSQTVSP

DTLCSSLCSLEDGLLGSPARLA\PSCWAMSCFSPN 

CPPAGKVPSAAW/APLEAQDSLYNSPLTESCLSP 

AEEEPAPCKDCQPLCPPLTGSWERQRQASDLASS 

GWSLDEDEAEPEEQ 


3753 


A 


3 


1138 


YYSS VRQRVTCEEPRFRECAA ALIEGS ATEV Y AG 

EWRADRRSGFGVSQRSNGLRYEGEWLGNRRHG 

YGRTTRPDGSREEGKYKRNRLVHGGRVRSLLPL 

ALRRGKVKEKVDRAVEGARRAVSAARQRQEIA 

AARAADALLKAVAASSVAEKAVEAARMAKLIA 

QDLQPMLEAPGRRPRQDSEGSDTEPLDEDSPGV 

YENGLTPSEGSPELPSSPASSRQPWRPPACRSPLP 

PGGDQGPFSSPKAWPEEWGGAGAQAEELAGYE 

AFnFAfVMOGPfTPRrifi^PT T r^r^nQQflCiT ppt?!? 

GEDEEPLPPLRAPAGTEPEPIAMLVLRGSSSRGPD 
AGCLTEELGEPAATERPAQPGAANPLWGAVAL 
LDLSLAFLFSQLLT 


3754 


A 


2 


3338 


SSLLEKMTSSDKDFRFMATSDLIVISELQKDSIQLD 
EDSERKWKMLLRLLEDKNGEVQNLAVKWLGV 
PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH
LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 
LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 
DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 
LCLQYIKHDPNYNYDSDEDEEQMETEDSEFSEQE 
SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 
LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 
LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL
VVKALQRQLKDRSVRARQGCFSLLTELAGVLPG 
SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 
GLLGTEPAEAFHPHLPELLPPVMACVADSFYKIA 
AEALWLQELVRALWPLHRPRMLDPEPYVGEMS 
AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 
RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV
SPLQLDLQPILAEALHDLASFLRKNQRALRLATLA 
ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 
MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 
LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 
AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 
VAALSAACPQVEAESTASRLVCDARSPHSSTGVK 
VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 
EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 
RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 
FQRCEGAEEGTRGWAECIGKLVLVNPSFLLPRL 
RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK 
SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE
VEMGPITGEITVDDGLDVRKAAFECMYSLLESCLG
QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT 
LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF
EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 
QIRSNPELAALFESIQKDSTSAPSTDSMELS 


3755 


A 


2 


3338 


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 

EDSERKVVKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS

LCLQYIKHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

VVKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALWLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV

SPLQLDLQPILAEALHILASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 
AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 
VAALSAACPQ\EAESTASRLVCDARSPHSSTGVK 
VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 
EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGWAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK

SFIAVHNKPSLViyi>LLDDILPLLYQETKIRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEPTLNHVEIXjLKDHYDIRMLTFIMVARL 

LCPAPVLQRVDRLDEPLRATCTAKVKAGSVKQEF 

EKQDELKRSAMRAVAALLTEPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS


3756 


A 


112 


1361 


SLEEQQGRHPSFAPKCASQELGRIMITLITEQLQK 
QTLDELKCTRFSISLPLPDHADISNCGNSFQLVSE 
GASWRGLPHCSCAEFQ/DQPQLQLPSLRPEPAPQ 
TAHRGMSPKEQPFSQVLRPEPPDPEKLPVPPAPPS 
KRHCRSLSVPVDLSRWQPVWRPAPSKLWTPIKH 
RGSGGGGGPQVPHQSPPKRVSSL/SVPPSSQCLFS 
MCPSSHTLQPSFLQPGPGPXDSSRPCAASPQSGSW 
ESDAESLSPCPPQRRFSLSPSLGPQASRFLPSARSS 
PASSPELPWRPRGLRNLPRSRSQPCDLDARKTGV 
KRRHEEDPRRLRPSLDFDKMNQKPYSGGLCLQE 
TAREGSSISPPWFMACSPPPLSASCSPTGGSSQVL 
SESEEEEEGAVRWGRQALSKRTLCQRDFGDLDL 
NLEEEN


3757 


A 


413 


1 


PKPMLQQDFT/SLPDQGLDH1AE/NSYFDARSLCA 
AELVCKEWQQVTSE*MLWKKLIERMVHAYPLW 
KGLSEKVW/DQHLFKNRPTDGPPNSFHRSLYPKII 
QVIETIESNWQCG*HTLQRIQCHSEKSKGVYCLQ 
YDDEK 


3758 


A 


2 


613 


FVSGSPWRMDGSTERLEARRPAGRLPWSSRQEM 

TRRPSLMAGRQHGWSAQQSATVANPVPGANPD 

LLPHFLGEPEDVYrVKNKPVLLVCKAVPATQIFF 

KCNGEWVRQVDHVIERSTDGSSGLPTK4EVRINV 

SRQQVEKVFGLEEYWCQCVAWSSSGTTKSQKA 

YIRIA YLRKNFEQEPLAKEVSLEQGIVLPCRPPEGI 

PPAE 


3759 


A


1 


561 


ADDTLHLWKT.RQKRPAELHSLKFCRERVTFGHLP 

FQSKWLYVGTERGNIHIVNVESFTLSGYVIMWN 

KAIELSSKSHPGPWfflSDNPMDEGKLLlGFESGT 

VVLWDLKSKKADYRYTYDEAIHSVAWHHEGKQ 

FICSHSDGTLTIWNVRSPAKPVQUTPHGKQLKD 

GKKPEPCKPILKVEFXTTR 


3760 


A 


1 


824 


LPACRCGCVAGCPSNHGICRCLRASERQVCVMH 

LKHLRTLLSPQDGAAKVTCMAWSQNNAKFAVC 

TVDRWLLYDEHGERRDKFSTKPADMKYGRJCS 

YMVKGMAFSPDSTKIAIGQTDNHYVYKIGEDWG 

DBCKVICNKFIQTVKFRPVPGTLG*TNIYQYIYL*IQ 

PGVAFLTSECDFSYCKDGASWLFMVICCLP*SPA 

VSFPIGD*\SAVTCLQWPAEYIIVFGLAEGKVRLS 

NTKTNKSSTIYGTESYVVSLTTNCSGKGILSGHA 

DGYQR 


3761


A 






rvl^KLbyPYurSLLlbrrLKCVSETSQQPPSRKVF 
QLLPSFPTLTRSKSHESQLGNRIDDVS SMRFDLSH 
GSPQMVRRDIGLSVTHRFSTKSWLSQVCHVCQK 
SMIFGVKCKHCRLKCHNKCTKEAPACRISFLPLT 
RLRRTESWSDINNPVDRAAEPHFGTLPKALTKK 
EHPPAMNHLDSSSNPSSTTFSTPSSPAPFPTSSNPS 

SATTPP\NPSP\GQR\DSRFNFPSC/AYFIHHR\Q\QFI 

FPDISAFAHAAPLPEAADGTRLDDQPKADVLEAH 

EAEAEEPEAGKSEAEDDEDEVDDLPSSRRPWRG 

PISRKASQTSVYLQEWDDPFEQVELGEPIGQGRW 

GRVHRGRWHGEVAIRLLEMDGHNQDHLKLFKK 

EVMNYRQTRHENVVLFMGACMNPPHLAIITSFC 

KGRTLHSFVRDPKTSLDINKTRQIAQEIIKGMGYL 

HAKGIVHKDLKSRNVFYDNG\KVVITDFGLRGIS 

GVVP\EGRRENQLKLSHDWLCYLAPEIVR£MTPG 

KDEDQLPFSKAADVYAFGTVWYELQARDWPLK 

NQAAEASIWQIGSGEGMKRVLTSVSLGKEVSEN 

LSACWAFDLQERPSXFSLLMDMLEKLPKLNRRLS 

HPGHF*KSADINSSKWPRFERFGLGVLESSNPK 

M 


3762 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 
APTDLVFILDGSYSVGPENFEWKICWLVMTKNF 
DIGPKFIQVGWQYSDYPVLEIPLGSYDSGEHLTA 
AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 
KIAVVLTDGKSQDDVKDAAQAARDSKITLFAIG 
VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 
EVMKQKLCEESVCPTRIPVAARDERGFDILLGLD 
VNKKVKKRIQLSPKKIKGYEVTSKVDLSELTSNV 
FPEGLPPSYYFVSTQRFKVKKIWDLWRILTIDG/* 
PQIAVTLNGVDKILLFTTTSVINGSQVVTFANPQV 
KTLFDEGWHQIRLLVTEQDVTLYIDDQQIENKPL
HPVLGILINGQTQIGKYSGKEETVQFDVQKLRIY
CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP
PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS
TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL
PGYKGEPGRDGDK


3763 


A 


3 


1267 


CKVWRNPLNLFTlGAEYmYTWVTGREPLTYYD 

MNLSAQDHQTFFTCDSDHLRPADAMQKAWRE 

ROTQARISAAHEALEINECATAYILLAEEEATTIA 

EAEKLFKQALKAGDGCYRRSQQLQHHGSQYEA 

QHSVLYLPLQ\TRHQCLGVHQKKASNVCQKTRE

DQGSSENDERFNEGVPPSEYVQYP*KPF\KALLEL 

QAYADVQAVLAKYDDISLPKSATICYTAALLKA 

RAVSDKFSPEAASRRGLSTAEMNAVEAIHRAVEF 

NPHVPKYLLEMKSLILPPEPGLKRGDSEAIAYAFF 

HLAHWKRVEGALNLLHCTWEGTFRMIPYPLEKG 

HLFYPYPICTETADRELLPSFHEVSVYPKKELPFFI 

LFTAGLCSFTAMLALLTHQFPELMGVFAKAVSV 

CLEGGLGEWMGKAKGIKAA 


3764 


A 


25 


1032 


RSADGLCGNKDRERGNEFTRNQQAAQEVVNPK 
KKMKKKKYVNSGTVTLLSFAVESECTFLDYIKG 
GTQINFTVAIDFTASNGNPSQSTSLHYMSPYQLN
AYALALTAVGEnQHYDSDKMFPALGFGAKLPPD 
GRVSHEFPLNGNQENPSCCGIDGILEAYHRSLRT 
VQLYGPTNFAPVVTHVARNAAAVQDGSQYSVL 
Lll lDGVlSDMAQTKEArvrNG\SKLPMSin 
AEFN AMVELDGDDVRIS SRGKL AERDIVQF VPFR 
DYVDRTGNHVLSMARLARDVLAEPDQLVSYM 
KAQGIRPRSPPAAPTHSPSQSPARTPPACPLHTHI 


3765 


A 


172 


3456 


LGMMDSPKIGNGLPVIGPGTDIGISSLHMVGYLG 
KNFDSAKVPSDEYCPACKEKGKLKALKTYRISFQ 

ES1FLCEDLQCIYPLGSKSLNNLISPDLEECHTPHK 

PQKRKSLESSYKDSLLLANSKKTRNYIAIDGGKV 

LNSKHNGEVYDETSSKTLPDSSGQQNPIRTADSLE 

RNEILEADTVDMATTKDPATVDVSGTGRPSPQN 

EGCTSKLEMPLESKCTSFPQALCVQWKNAYALC 

WLDCILSALVHSEELKNTVTGLCSKEESIFWRLL 

TKYNQANTLLYTSQLSGVKDGDCKKLTSETFAEI 

ETCLNEVRDEIFISLQPQLRCTLGDMESPVFAFPL 

LLKLETHIEKLFLYSFSWDFECSQCGHQYQNRH 

MKSLVTFTNVIPEWHPLNAAHFGPCNNCNSKSQI 

RKMVLEKVSPIFMLHFViEGLPQNDLQHYAFHFE 

GCLYQITS VIQYRANNHFITWILDADG S WLECDD 

LKGPCSERHKKFEVPASEIHIVIWERKISQVTDKE 

AACLPLKKTNDQHALSNEKPVSLTSCSVGDAAS 

AETASXOHPKDISVAPRTLSQDTAVTHGDHLLSG 

PKGLVDNILPLTLEETIQKTASVSQLNSEAFLXLEN 

KPVAENTGILKTNTLLSQESLMASSVSAPCNEKLI 

QDQFVDISFPSQVVNTOMQSVQLNTEDTVNTKS 

VNNTDATGLIQGVKSVEIEKDAQLKQFLTPKTEQ 

LKPERVTSQVSNLKKKETTADSQTTTSKSLQNQS 

LKENQKKPFVGSWVKGLISRGASFMPLCVSAHN 

RNTITDLQPSNOCGVNNFGGFKTKGINQKASHVSK 

KARKSASKPPPISKPPAGPPSSNGTAAHPHAHAA 

SEVLEKSGSTSCGAQLNHSSYGNGISSANHEDLV 

EGQIHKLRLKLRKKLKAEKKKLAALMSSPQSRT 

VRSENLEQVPQDGSPNDCESIEDLLNELPYPIDIA 

NESACTTVPGVSLYSSQTHEEELAELLSPTPVSTE 

LSENGEGDFRYLGMGDSH1PPPVPSEFNDVSQNT 

HLRQDHNYCSPTKKNPCEVQPDSLTONACVRTL

NLESPMKTDIFDEFFSSSALNALANDTLDLPHFDE 

YLFENY 


3766 


A 


3 


1622 


AQQIVYRNVMLENYKNLVSLGYQLTKPDVILRL

EKGEEPWLVEREEHQETHPDSETAFEIKSSVSSRSI

FKDKQSCDIKMEGMAR2TOLWYLSLEEVWKCRD 

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSFA^HSSRLIRHQR

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR 

PYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK 

HQRMVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 


3767 


A 


3 


1622 


AQQIVYRNVMLENYKNLVSLGYQLTKPDVILRL 
EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI 

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG
KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 
LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 
DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK
ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSFA^HSSRLIRHQR 

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR 

FYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 

EQFLTCNQCGTALVNTSNLIGYQTKHIRENAY 


3768 


A 


185 


2258 


SIIIKMSRKISKESKKVNISSSLESEDISLETTVPTD 

DISSSEEREGKVRITRQLIERKELLHNIQLLKIELS 

QKTMME)NLK\nDYLTKIEELEEKLNDALHQKQL 

LTLRLDNQLAFQQKDASKYQELMKQEMETILLR 

QKQLEETNLQLREKAGDVRRSLRDFELTEEQYIK 

LKAFPEDQLSIPEYVSVRPYELVNPLRKEICELQV 

KKNILAEELSTNKNQLKQLTETYEEDRKNYSEV 

QIRCQRLALELADTKQLIQQGDYRQENYDKVKS 

ERDALEQEVIELRRKHEELEASHMIQTKERSELSK 

EYVTLEQTVTLLQKDKEYLNRQNMELSVRCAHE 

EDRLERLQAQLEESKKAREEMYEKYVASRDHY 

KTEYENKLHDELEQIRLKTNQEIDQLRNASREMY 

ERENRNLREARDNAVAEKERA VMAEKDALEKH 

DQLLDRYRE\LQ\LSTESKVTEFLHQSKLKSFESE 

RVQLLQEETARNLTQCQLECEKYQKKLEVLTKE 

FYSLQASSEKRITELQ AQNSEHQARLDIYEKLEK 

ELDEIIMQTAEIENEDEAERVLFSYGYGANVPTT 

AKRRLKQSVHLARRVLQLEKQNSLI/LKRSGTSK 

GPSNTAFTRSLTEANSLLNQTQQPYRYLIESVRQ 

RDSKDDSLTESIAQL/ERKDVSNLNKEKSALLQTN 

GIKMAL\DL\DQLLNHP 


3769 


A 


3 


2297


DAAEFRVVADAMKVIGFKPEEIQTVYKILAAILH 
LGNLKFVVDGDTPLIENGKVVSIIAELLSTKTDM
VEKALLYRTVATGRDUDKQHTEQEASYGRDAF 
AKAIYERLFCXVOVTRINDimVI^YDTTmGKNTV 
IGVLDIYGFEIFDNNSFEQFCINYCNEKLQQLFIQL 
VLKQEQEEYQREGIPWKHID YFNNQIIVDLVEQQ 
HKGIIAILDDACMNVGKVTDEMFLEALNSKLGK 
HAHFSSRKLCASDKILEFDRDFRIRHYAGDVVYS 
VIGFIDKNKDTLFQDFKRLMYNSSNPVLKNMWP 
EGKLSITEVTXRPLTAATLFKNSMIALVDNLASK 
EPYYVRCIKPNDKKSPQIFDDERCRHQVEYLGLL 
ENVRVRRAGFAFRQTYEKFLHRYKMISEFTWPN 
HDLPSDKEAVKKLIERCGFQDDVAYGKTKIFIRT 
PRTLFTLEELRAQMLIRI VLFLQK V WRGTLARMR 
YKRTKAALTiniYYRRYKVKSYIHEVARRFHGVK 
TMRDYGKHVKWPSPPKVLRRFEEALQTIFNRWR 
ASQLIKSEPASDLPQVRAKVAAVEMLKGQRADL 1 
GLQRAWEGNYLASKPDTPQTSGTFVPVANELKR 
KJjKYMNVLFSCHVRKVNRFSKVEDRAlFV 
LYKMDPTKQYKVMKTEPLYNLTGLSVSNGKDQL

NHFKSEKRHLQV\NVTNPVQCSLHGKKCTVSVE 
TRLNQPQPDFTKNRSGFILSVPGN 


3770 


A 


3 


6276 


HKVAAPDVVVPTLDTVRHEALLYTWLAEHKPL
VLCGPPGSGKTMTLFSALRALPDMEWGLNFSS 
ATTPELLLKTFDHYCEYRRTPNGVVLAPVQLGK 
WLVLFCDEINLPDMDKYGTQRVISFIRQMVEHG 
GFYRTSDQTWVKLERIQFVGACNPPTDPGRKPLS 
HRFLRHWVVYYDYPGPASLTQIYGTFNRAMLR 
LIPSLRTYAEPLTAAMVEFYTMSQERFTQDTQPH 
Y1YSPREMTRWVRGIFEALRPLETLPVEGLIRJWA 
HEALRLFQDRLVEDEERRWTDENIDTVALKHFP 
NIDREKAMSRPILYSNWLSKDYIPVDQEELRDYV 
KARLKVFYEEELDVPLVLFNEVLDHVLRIDRJFR 
QPQGHLLLIGVSGAGKTTLSRFVAWMNGLSVYQ 
IKVrDUCYTGEDFDEDLRTVLRRSGCKNEKIAFIM 
DESNVLDSGFLERMNTLLANGEVPGLFEGDEYA 
TLMTQCKEGAQKEGLMLDSHEELYKWFTSQVIR 
NLHVVFTMNPSSEGLKDRAATSPALFNRCVLNW 
FGDWSTEALYQVGKEFTSKMDLEKPNYIVPDYM 
PVVYDKLPQPPSHREAIVNSCVFVHQTLHQANA 
RIAKRGGRTMAITPRHYLDFINHYANLFHEKRSE 
LEEQQMHLNVGLRKIKETVDQVEELRRDLRIKS 
QELEVKNAAANDKLKXMVKDQQEAEKKKVMS 
QEIQEQLHKQQEVIADKQMSVKEDLDKVEPAVI 
EAQNAVKSIKKQHLVEVRSMANPPAAVKLALES 
ICLLLGESTTDWKQIRSnMRENFIPTIVNFSAEEIS 
DAIREKMKKNYMSNPSYNYEIVNRASLACGPMV 
KWAIAQLNYADMLKRVEPLRNELQKLEDDAKD 
NQQKANEVEQMIRDLEASIARYKEEYAVLISEAQ
AIKADLAAVEAKVNRSTALLKSLSAERERWEKT
SETFKNQMSTIAGDCLLSAAFIAYAGYFDQQMR 
QNLFTTWSHHLQQANIQFRTDIARTEYLSNADER 
LRWQASSLPADDLCTENAIMLKRFNRYPLIIDPS 
GQATEFIMNEYKDRKITRTSFLDDAFRKNLESAL 
RFGNPLLVQDVESYDPVLNPVLNREVRRTGGRV 
LITLGDQDIDLSPSFVIFLSTRDPTVEFPPDLCSRV 
TFVNFTVTRSSLQSQCLNEVLKAERPDVDEKRSD 
LLKLQGEFQLRLRQLEKSLLQALNEVKGRILDDD 
TIITTLENLKREAAEVTRKVEETDIVMQEVETVS
QQYLPLSTACSSIYFTMESLKQIHFLYQYSLQFFL 
DIYHNVLYENPNLKGVTDHTQRLSnTXDLFQVA 
r^RVARGMLHQDHlTFAMLLARIKLKGTVGEPT 
YDAEFQHFLRGNEIVLSAGSTPRIQGLTVEQAEA 
WRLSCLPAFKDLIAKVQADEQFGIWLDSSSPEQ 
TVPYLWSEETPATPIGQAIHRLLLIQAFRPDRLLA 
MAHMFVSTNLGESFMSIMEQPLDLTQIVGTEVKP 
NTPVLMCSVPGYDASGHVEDLAAEQNTQITSIAI 
GSAEGFNQADKAINTAVKSGRWVMLKNVHLAP 
GWLMQLEKKLHSLQPHACFRLFLTMEINPKVPV 
NLLRAGRIFVFEPPPGVKANMLRTFSSIPVSRICK 
SPNERARLYFLLAWFHAIIQERLRYAPLGWSKKY 
EFGESDLRSACDTVDTWLDDTAKGRQNISPDKIP 
WSALKTLMAQSIYGGRVDNEFDQRLLNTFLERL 
FTTRSFDSEFKLACKVDGHKDIQMPDGIRREEFV 
nwpjvi t PDTnTPCWT m pxtmatjpatt t TTn/*j\/r> 

yvVV EiL,LirLs 1 v 1 ro Wi^vJJLr ININA.DK V LL, I 1 v^Lr V U 

MISKMLKMQMLEDEDDLAYAETEKKTRTDSTS 
DGRP\AWMRTLHTTASNWLHLEPQTLSHLKRTVE 
NIKDPLFRFFE\RE\n:<MGAKLLQ\DVRQDLADV\V 
QVCEGKKXQT^m.RTLIVNELV^GILP\RSWSHY 










wpagmtviqwgvpisarrm:qlqnisl\aaasg 

swsleelclevnvttsqgatldacsfgvtglkl 
qgatcnnnklslsnaistalpltqlrwvkqtnt 
ekkasvvtlpvylnftradliftvdfeiatkedpr 
sfyergvavlcte 


3771 


A 


1 


2043 


LPLLHAGFNRRFMENSSIIACYNELIQ1EHGEVRS 

QFKLRACNSVFTALDHCHEAIEITSDDHVIQYVN 

PAFERMMGYHKGELLGKELADLPKSDKNRADL 

LDTINTCIKKGKEWQGVYYARRKSGDSIQQHVKI 

TTVIGQGGKIRHFVSLKKJLCCTTDNNKQIHKIHR 

DSGDNSQTEPHSFRYKNRRKESIDVKSISSRGSDA 

PSLQNRRYPSMARIHSMTIEAPITKVINIINAAQEN 

SPVTVAEALDRVLEILRTTELYSPQLGTKDEDPH 

TSDLVGGLMTDGLRRLSGNEYVFTKNVHQSHSH 

LAMPITINDVPPCISQLLDNEESWDFNIFELEAITH 

KRPLVYLGLKVFSRFGVCEFLNCSETTLRAWFQ

VIEANYHSSNAYHNSTHAADVLHATAFFLGKER

VKGSLDQLDEVAALIAATVHDVDHPGRTNSFL\C 

NAGSELAVLYNDT\AV\LESHHTALAFQ\LTVKDT 

K\CNIFKNID/RGNHYRTLRQAIIDMVLATEMTKH 

FEHVNKFVNSINKPMAAEIEGSDCECNPAGKNFP 

EEYFAQTDEEKRQGLPWMPVFDRNTCSIPKSQI 
SFIDYFITDMFDAWDAFAHLPALMQHLADNYKH 
WKTLDDLKCKSLRLPSDRLKPSHRGGLLTDKGH 
CESQ 


3772 


A 


1013 


50 


TLVHADGFPSLHITETCLAYREKRIGIDLVHDTVE 

IteLIKEAEIIQGIMALLTRTLEEASEQIRMlSIRSAK 

YNLEKDLKDKFVALTIDDICFSLNNNSPNIRYSEN 

AVRIEPNSVSLEDWLDFSSTNVEKADKQKNNSL 

MLKALVDVRILSQTANYLRKQCDVVHTAFKNGL 

AILDQEGPAKVAHTRLETRTHRPNVELCRDVAQ 
YRLMKEVQEITHNVARLKETLA\QAQAELKGLH 
RRQLALQEEIQVKEOTIYIDEVLCMQMRBCSIPLR 
DGEDHGVWAGGLRPDAVC 


3773 


A 


1 


955 


AAARESERQLRLRLCVLNEILGTERDYVGTLRFL 

QSAFLHRIRQNVADSVEKGLTEENVKVLFSNIEDI 

LEVHKDFLAALEYCLHPEPQSQHELGNVFLKFK 

DKFCVYEEYCSNHEKALRLLVELNKIPTVRAFLL 

SCMLLGGRKTTDIPLEGYL\LSPIQRICKYPLLLKE 

RQMEKLEALEAAA/QSHIEGWEGSNLTDICTQLL 
LQGTLLKISAGMQERAFFLFDNLLVYCKRKSRV 
TGSKKSTKRTKSINGSLYIFRGRINTEVMEVENVE 
DGTGSPSPSLA 


3774 


A 


4254 


2061 


ELQGDFSVPDVPKSMAWCENSICVGFKRDYYLI 

RVDGKGSIKELFPTGKQLEPLVAPLADGKVAVG 

QDDLTVVLNEEGICTQKCALNWTDIPVAMEHQP 

PYHAVLPRYVEIRTFEPRLLVQSIELQRPRFITSGG 

SNIIYVASNHFVWRLIPVPMATQIQQLLQDKQFE 

LALQLAEMKDDSDSEKQQQIHHTKNLYAFNLFC 

QKRFDESMQVFAKLGTDPTHVMGLYPDLLPTDY 

RKQLQYPNPLPVLSGAELEKAHLAL1DYLTQKRS 










QLVKKLNDSDHQSSTSPLMEGTPTlKSKKKLLQn 

DTTLLKCYLHTNVALVAPLLRLENNHCHIEESEH 

VLKKAHKYSELIILYEKKGLHEKALQVLVDQSK

KANSPLKGHERTVQYLQHLGTENLHLIFSYSVW 

VLRDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN 

FKGLAIPYLEHIIHVWEETGSRFHNGLIQLYCEKV

QGLMKEYLLSFPAGKTPVPAGEEEGELGEYRQK 

LLMFLEISSYYDPGRLICDFPFDGLLEERALLLGR 

MGKHEQALFIYVHILKDTRMAEEYCHKHYDRN 

l^ r Tin'MVr^\7*V/T QT T "D A/TVT CDDCTIJPT /" , D*n/"T n T T?r»v 
ISJL'uiNJsaJ V i J_#oL»JLKM Y Lor r MriCLUr JJKXbLJJlPK. 

ANLQAALQVLELHHSKLDTTKALNLLPANTQIN
DIRIFLEKVLEENAQICKRWQVLKlsrLLHAEFLRV\ 
QEERILHQQVKCnTEEKVCMVCKKKIGNSAFAR 
YPNGVVVHYFCS\KEVNPADT 


3775 


A 


1832 


839 


MSRARGALCRACLALAAALAALLLLPLPLPRAP 

APARTPAPAPRAPPSRPAAPSLRPDDVFIAVKTTR 

KNHGPRLRLLLRTW\ISRARQQTFIFTDGDDPELE 

LQGGDRVINTNCSAVRTRQALCCKMSVEYDKFI 

ESGRKWFCHVDDDNYVNARSLLHLLSSFSPSQD 

A/VT ^DDCT TEUPTIT A TTJPAfnPPP in/' r"P\ fTST?\T 777 A T" • 

V i J^LxlvroJL-JJrtrlnAlJbKV^uOKI V 1 X YKr WrAT 

GGAGFCLSRGLALKMSPWASLGSFMSTAEQVRL 

PDDCTVGYIVEGLLGARLLHSPLFHSHLENLQRL 

PPDTLLQQVTLSHGGPENPQNVVNVAGGFSLHQ 

DPTRFKSIHCLLYPDTDWCPRQKQGAPTSR 


3776 


A 


3


796


PRAKLGTRARNMAGQDAGCGRGGDDYSEDEGD 
SSVSRAAVEVFGKLICDLNCPFLEGLYITEPKTIQE 
LLCSPSEYRLEILEWMCTRVWPSLQDRFSSLKGV 
r 1 ti V KiyjClvi Jl iSJLuJnLc.jjJVlLUAr JJDVJilLJLKOL'ACA 
QKQLHFMDQLLDTIRSLTIGCSSCSSLMEHFEDT 
REKNEALLGELFSSPHLQMLLNPECDPWPLDMQ 
PLLNKQSDDWQWASASAKSEEEEKLAELARQLQ 
ESAAKLHALRTEYFAQHEQGAAAGAA\TSAP


3777 


A 


3 


413 


ocilL/ V LcOrv I A v lntsJ\JOfLtvKooA\j V VJiL//i(j\Ji} Vvj 

NMLEG VG VDIM<ALL AKRKRLEMYTKASLRTSN 
QKEEHVWKTQQDQRQKLNQEYSQQFLTLFQQW
DLDMQKAEEQEEKILVGIMIRFIINQVSSRNGQPS 
LLL 


3778 


A 


132 


788 


SRLPPPPPHLADGRAGARVPRSARLSRWWVQD 

WTOGPIVRPPAAARTMWVNPEEVLLANALWITE 

T? AXTPVPTT OPPk r /^i'Pf A nTinnnnnm Am t \rr ,r rr rv 
is-ftJNr i r iia^isJnjvOxIA UlJOUUOw 1 JLU 

VVLDSSARVAPYRILYQTPDSLVYWTIACGVGSR 

KEITEHWEWLEQNLLQTLSIFENENDITTFVRGKI 

QGIIAEYNKINDVKEDDDTEKFKEArVKFHRLFG 

MPEEEKLVNYYSCSYWKG


3779 


A


2 


934 


CKSCTLFPQNPNLPPPSTRERPPGCKTVFVGGLPE 
NATEEIIQEVFEQCGDITAIRKSKXNFCHIRFAEEF 
MVDKAIYLSGYRMRLGSSTDKKDSGRLHVDFA 
QARDDFYEWECKQRMRAREERHRRKLEEDRLR
PPSPPAIMHYSEHEAALLAEKLKDDSKFSEAM\Q 

v LjLiO w ldaAjJI V In KisAoAiN K^r i oJVLV v^AIMori VK1\L 

MNEKATHEQEMEEAKENFKNALTGILTQFEQIV 
AVFNASTRQKAWDHFSKAQRKNIDIWAK\HSEE 
LRNAQSEQLMG1RREEEMEMSDDENCDSPTKKM 
RVDESALGAP 


3780 


A 


1 


2535 


AAQAEREELAAGRMPGGGPQGAPAAAGGGGVS 










HRAGSRDCLPPAACFRRRRLARRPGYMRSSTGP 

GIGFLSPAVGTLFRFPGGVSGEESHHSESRARQC 

GLDSRGLLVRSPVSKSAAAPTVTSVRGTSAHFGI 

QLRGGTRLPDRLSWPCGPGSAGWQQEFAAMDS 

SETLDASWEAACSDGARRVRAAGSLPSAELSSNS 

CSPGCGPEVPPTPPGSHSAFTSSFSFIRLSLGSAGE 

RGEAEGCPPSREAESHCQSPQEMGAKAASLDGP 

HEDPRCLSQPFSLLATRVSADLAQAARNSSRPER 

DMHSLPDMDPGSSSSLDPSLAGCGGDGSSGSGD 

AHSWDTLLRKWEPVLRDCLLRNRRQMEVISLRL 

KLQKLQEDAVENDDYDKAETLQQRLEDLEQEKI 

SLHFQLPSRQPALSSFLGHLAAQVQAALRRGATQ 

QASGDDTHTPLRMEPRLLEPTAQDSLHVSITRRD 

WLLQEKQQLQKE1EALQARMFVLEAKDQQLRRE 

IEEQEQQLQWQGCDLTPLVGQLSLGQLQEVSKA 

LQDTLASAGQIPFHAEPPETIRSLQERIKSLNLSLK 

EITTKVCMSEKFCSTLRKKVNDIETQLPALLEAK 

MHAISGNHFWTAKDLTEEIRSLTSDREGLEGLLS 

KLLVLSSRNVKKLGSVKEDYNRLRREVEHQETA 

YETSVKENTMKYMETLKNKLCSCKCPLLGKVW 

■ ~C A T~^T C A T T T/~\i~*T /"\T /"\T? A T> PCT 0\ f TT"NT?Ti /~\~K JTr\T\ 

cAlJLJ^UKJLLl(^CLQLv^bAKGSLSVEDERQM 

LEGAAPPIPPRLHSEDKRKTPLKESYE-SAELGEK 

CEDIGIOCLLYLEDQLHTAIHSHDEDLIQSLRRELQ 

MVKETLQAMILQLQPAKEAGEREAAASCMTAG 

VHEAQA 


3781 


A 


3 




GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 
GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 
SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 
TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK
TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 
GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

"VO/^lTLTD/^lT XT A 0/"* A A /"YK Af\Ti\ jTLTT> \/~r\"\ ro a t rwrvToi * 

Yr^JtlFoLiNArluAAQMyrMHRYDVSALQYN 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGS\MG 

SVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 


3782 


A 


1 


2649 


FRVPDSCPVVLHSFTQLDPDLPRPESSTQEIGEELI 

NGVIYSISLRKVQLHHGGNKGQRWLGYENESAL 

NLYETCKVRTVKAGTLEKLVEHLVPAFQGSDLS 

YVTIFLCTYRAFTTTQQVLDLLFKRYGRCDALTA 

SSRYGCDLPYSDEDGGPQDQLKNAISSELGTWLD 

QYSEDFCQPPDFPCLKQLVAYVQLNMPGSDLER 

RAHLLLAQLEHSEPffiAEPEGEEDWALSPVPALK 

PTPELELALTPARAPSPVPAPAPEPEPAPTPAPGSE 

LEVAPAPAPELQQAPEPAVGLESAPAPALELEPA 

PEQDPAPSQTLELEPAPAPVPSLQPSWPSPVVAEN 

GLSEEKPHLLVFPPDLVAEQFTLMDAELFKKWP 

YHCLGSIWSQRDKKGKEHLAPTIRATVTQFNSV 

ANCVITTCLGNRSTKAPDRARWEHWIEVAREC 

RILKNFSSLYAILSALQSNSIHRLKKTWEDVSRDS 

FRIFOKLSEIFSDENlsry'SLSRELLIKEGTSKFATLE 

MNPKRAQKJRPKETGIIQGTVPYLGTFLTDLVML 

DTAMKDYLYGRLINFEKRRKEFEVIAQIKLLQSA 

CNNYSIAPDEQFGAWFRAVERLSETESYNLSCEL 

EPPSESASNTLRTKKNTAIVKRWSDRQAPSTELS 






WO 01/57190 



PCT/US01/04098













TSGSSHSKSCDQLRCGPYLSSGDIADALSVHSAG 

SSSSDVEEINISFVPESPDGQEKKFWESASQSSPET 

SGISSASSSTSSSSASTTPVAATRTHKRSVSGLCNS 

SSALPLYNQQVGDCCIIRVSLDVDNGNMYKSILV 

TSQDKAPAVIRKAMDKHNLEEEEPEDYELLQELS 

DDRKLKIPENANVFYAMNSTANYDFVLKKRTFT

KGVKVKHGASSTLPRMKQKGLK1AKGIF 


3783 


A 


3 


869 


RSGQGKVYGLIGKRRFQQMDVLEGLNLLITISGK 

R>JKLRV YYLS WLRNKILHNDPE VEKKQG WTTV 

GDMEGCGHYRWKYERIKFLVIALKSSVEVYAW 

APKPYHKFMAFKSFADLPHRPLLVDLTVEEGQR 

LKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQIT 

PHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIK 

DWLQWGEMPTSVAY1CSNQIMGWGEKAIEIRS 

VETGHLDGVFMHKRAQRLKFLCERNDKVFFASV 

RSGGSSQVYFMTLNRMCIMNW 


3784 


A 


1213 


457 


LSPRQVDGLAGLQKGLSLSLLYQFLMNGIRLGTY 

GLAEAGGYLHTAEGTHSPARSAAAGAMAGVMG 

AYLGSPIYMVKTHLQAQAASE1AVGHQYKHQG 

MFQALTEIGQKHGLVGLWRGALGGLPRVrVGSS 

TQLCTFSSTKDLLSQWEIFPPQSWKLALVAAMM 

SGIAWLAMAPFDVACTRLYNQPHRCTGQGP\LY

RGELDALLQTARTEGEFGMYKGIGASYFRLGPHTI 

LSLFFWDQLRSLYYTDTK 


3785 


A 


193 


813


RRRGRHSLCGGKMLAYCVQDATVVDVEKRRNP 

SKHYVYIINVTWSDSTSQTIYRRYXSKFFDLQMQL 

LD\KFP1\ESGQKDPKQRIIPFLPGKILFRRSHIRDV 

AVKRLKPIDEYCRALVRLPPHISQCDEVFRFFEAR 

PEDVNPPKEQGPSPPDAVLPYGVNKGKQELKAG 

PNWPGRTHHVVNCVTQKCLFVFHFKFSSSGNKE 

SKSL 


3786 



A


3785 


1632 


EFVGRAASTTVVTRIAWRMADAG1RRVVPSDLY 

PLVLGFLRDNQLSEVANKFAKATGATQQDANAS 

SLLDIYSFWLNRSAKVPERKLQANGPVAKKAKK 

KASSSDSEDSSEEEEEVQGPPAKKAAVPAKRVGL 

PPGKAAAKASESSSSEESSDDDDEEDQKKQPVQ 

KGVKPQAKAGQAPPKKAKSSDSDSDSSSEDEPP 

KNQKPOTP\VTVKAQTKAPPKPARA\APKIANGK 

AASSSSSSSSSSSSDDSEEEKAAATPKKTVPKKQV 

VAKAPVKAATTPTRKSSSSEDSSSDEEEEQKKPM 

'KNKPGPYSSVPPPSAPPPKKSLGTQPPKKAVEKQ 

QPVESSEDSSDESDSSSEEEKKPPTKAVVSKATTK 

PPPAKKAAESSSDSSDSDSSEDDEAPSKPAGTTK 

NSSNKPAVTTKSPAVKPAAAPKQPVGGGQKLLT 

RKADSSSSEEESSSSEEEKTKKMVATTKPKATAK 

AALSLPAKQAPQGSRDSSSDSDSSSSEEEEEKTSK 

SAVKKKPQKVAGGAAPSKPASAKKGKAESSNSS 

SSDDSSEEEEEKLKGKGSPRPQAPKANGTSALTA 

QNGKAAKNSEEEEEEKKXAAVVVSKSGSLKKR 

KQNEAAKEAETPQAKKIKLQTPNTTPKRKKGEK 

T? A QCPCT? P\7 PTCCETC \7T\ CD \f A TYKT01?T"\ A VDP A A /~*T\ 

Jt^oorrKK V rvjiJbt.ix: V DMv V ALJNbrUAKKQjAACjD 

WGERANQVLKFTKGKSFRHEKTKKKRGSYRGG 
SISVQVNSIKFDSE 


3787 


A 


3 


5078 


IPEG/RALSAEHTSSLVPSLHITTLGQEQAILSGAV 
PASPSTGTADFPSILTFLQPTENHASPSPVPEMPTL 










PAEGSDGSPPATRDLLLSSKVPNLLSTSWTFPRW 

KKDSVTAILGKNEEANVTIPLQAFPRKEVLSLHT 

VNGFVSDFSTGSVSSPIITAPRTOPLPSGPPLPSILS 

IQATQTVFPSLLAFSSTKPEVYAAAVDHSGLPAS 

APKQVRASPSSMDVYDSLTIGDMKKPATTDVFW 

SSLSAETGSLSTESIISGLQQQTNYDLNGHTISTTS 

WETHLAPTAPPNGLTSAADAKSQDFKDTAGHS 

VTAEGFSIQDLVLGTSIEQPVQQSDMTMVGSHID

LWPTSNNNHSRDFQTAEVAYYSPTTRHSVSHPQ 

LQLPNQPAHPLLLTSPGPTSTGSLQEMLSDGTDT 

GSEISSDINSSPERNASTPFQNILGYHSAAESSISTS 

VFPRTSSRVLRASQHPKKWTADTVSSKVQPTAA 

AAVTLFLRKSSPPALSAALVAKGTSSSPLAVASG 

PAKSSSMTTLAKNVTNKAASGPKRTPGAVHTAF 

PFTPTYMYARTGHTTSTHTA/IARKHGHCLWPW 

YNLP/PP/GKPQAMHTGLPNPTOLEMPRASTPRPL 

TVTAALTSITASVKATRLPPLRAENTDAVLPAAS 

AAVVTTGKMASNLECQMSSKLLVKTVLFLTQRR 

VQISESLKFSIAKGLTQALRKAFHQNDVSAHVDI 

LEYSHNVTVGYYATKGKLVYLPAWIEMLGVY 

GVSNVTADLKQHTPHLQSVAVLASPWNPQPAG 

YFQLKTVLQFVSQADNIQSCKFAQTMEQRLQKA 

FQDAERKVLNTKSNLTIQIVSTSNASQAVTLVYV 

VGNQSTFLNGTVASSLLSQLSAELVGFYLTYPPL 

TIAEPLEYPNLDISETTRDYWVITVLQGVDNSLV 

GLHNQSFAJRVMEQRLAQLFMMSQQQGRRFKRA 

TTLGSYTVQMVKMQRVPGPKDPAELTYYTLYN 

GKPLLGTAAAKILSTE)SQRMALTLHHWLLQ AD 

PVVKNPPNNLWIIAAVLAPIAVVTVIIIIITAVLCR 

KXKNDFKPDTMINLPQRAKPVQGFDYAKQHLG 

QQGADEEVIPVTQETVVLPLPIRDAPQERDVAQD 

GSTIKTAKSTETRKSRSPSENGSVISNESGKPSSGR 

RSPQNVMAQQKVTKEEARKRNVPASDEEEGAV 

LFDNSSKVAAEPFDTSSGSVQLIAIKPTALPMVPP 

TSDRSQESSA\TLNGEVNKALKQKSDIEHYRNKL

RLKAKRKGYYDFPAVETSKGLTERKKMYEKAP

KEMEHVLDPDSELCAPFTESKNRQQMKNSVYRS 

RQSLNSPSPGETEMDLLVTRERPRRGIRNSGYDT 

EPEIIEETNIDRVPEPRGYSRSRQVKGHSETSTLSS 

QPSIDEVRQQMHMLLEEAFSLASAGHAGQSRHQ 

EAYGSAQHLPYSEVVTSAPGTMTRPRAGVQWVP 

TYRPEMYQYSLPRPAYRFSQLPEMVMGSPPPPVP

PRTGPVAVASLRRSTSDIGSKTRMAESTGPEPAQ 

LHDSASFTQMSRGPVSVTQLDQSALNYSGNTVP 

AVFAIPAANRPGFTGYFDPTPPSSYRNQAWMSYA 

GENELPSQWADSVPLPGYIEAYPRSRYPQSSPSRL 

PRQYSQPANLHPSLEQAPAPSTAASQQSLAENDP 

SDAPLIMSTAALVKAIREEVAKLAKXQTDMFEF 

QV 


3788 


A 


2 


1737 


MKGLYTDAEMKSDNVKDKDAKISFLQKAIDVV 

VMVSGEPLLABGPA^WAGHEPERTNELLQIIGKC 

CLNKLSSDDAVRRVLAGEKGEVKGRASLTSRSQ 

ELDNKNVREEESRVHKNTEDRGDAEIKERSTSRD 

RKQKEELKEDRMPREKDKDKEKAKENGGNRHR 

EGERERAKARARPDNERQKDRGNRERDRDSERK 










KETERKSEGGKEKERLRDRDRERDRDKGKDRDR 

RRVKNGEHSWDLDRENNREHDKPEKKSASSGE 

MSKKLSDGTFKDSKAETETEISTRASKSLTTKTS 

KRRSKNSVEGDSTSDAEGDAGPAGQDKSEVPET 

PE1PNELSSNIRRIPRPGSARPAPPRVKRQDSMEAL 

QMDRSGSGKTVSNVITESHNSDNEEDDQFWEA 

APQLSEMSEIEMVTAVELEEEEKHGGLVKKILET 

KKDYEKLQQSPKPGEKERSLFESAWKKEKDIVS 

KEIEKLRTSIQTLCKSALPLGKIMDYIQEDVDAM 

QNELQM\YHSENRQHAEALQQEQRITDCAVEP\L 

KAELA\ELEQLIKD\Q\QDKICAVKANILKNEEKIQ 

KMVYSINLTSRR 


3789 


A 


1 


4369 


MRTLGTCLATLAGLLLTAAGETFSGGCLFDEPYS 

TCGYSQSEGDDFNWEQVNTLTKPTSDPWMPSGS

FMLVNASGRPEGQRAHLLLPQLKENDTHCIDFH 

YFVSSKSNSPPGLLNVYVKVNNGPLGNPIWNISG 

DPTRTWNRAELAISTFWPNFYQVIFEVITSGHQG 

YLAIDEVKVLGHPCTRTPHFLRIQNVEVNAGQFA 

TFQCSAIGRTVAGDRLWLQGIDVRDAPLKEIKVT 

SSRRFIASFNVVNTTKRDAGKYRCMI\RTEGGVGI 

SNYAELVWKEPPVPIAPPQLASVGATYLWIQLN 

ANSINGDGPIVAREVEYCTASGSWNDRQPVDSTS 

YKIGHLDPDTEYEISVLLTRPGEGGTGSPGPALRT 

RTKCADPMRGPRKLEVVEVKSRQITIRWEPFGY 

NVTRCHSYNLTVHYCYQVGGQEQVREEVSWDT 

ENSHPQHTlTNLSPYTm^S\n^ILMNPEGRKESQ 

ELIVQTDEDLPGAVPTESIQGSTFEEKIFLQWREP 

TQTYGVITLYEITYKAVSSFDPEIDLSNQSGRVSK

LGNETHFLFFGLYPGTTYSFTIRASTAKGFGPPAT 

NQFTTKISAPSMPAYELETPLNQTDNTVTVMLKP 

AHSRGAPVSVYQIWEEERPRRTKKTTEILKCYP 

VPIHFQNASLLNSQYYFAAEFPADSLQAAQPFTIG 

DNKTYNGYWNTPLLPYKSYRIYFQAASRANGET 

KIDCVQVATKGAATPKPVPEPEKQTDHTVKIAG 

VIAGILLFVIIFLGVVLVMKKRKLVAKKRKETMSS

TRQEIDLWIGELNGPRSYAEQGTKLATRAFSFMD

THNLNGRSVSSPSSFTMKTNTLSTSVPNSYYPDE

TrTTMASDTSSLVQSHTYKKREPADVPYQTGQLH 

PAJRVADLLQHITQMKCAEGYGFKEEYESFFEGQ 

SAPWDSAKKDENRMK^fRYGNIIAYDHSRVRLQT 

IEGDTNSDYINGNYTDGYHRPNHYIATQGPMQET 

IYDFWRMVWHENTASIIMVTNLVEVGRVKCCK 

YWPDDTEIYKDKVTLIETELLAEYVIRTFAVEKR 

GVHEIREIRQFHFrGWPDHGVPYHATGLLGFVR 

QVKSKSPPSAGPLWHCSAGAGRTGCFIVIDIML 

DMAEREGWDIYNCVRELRSRRVNMVQTEEQY 

VFIHDAILEACLCGDTSVPASQVRSLYYDMNKLD 

PQTNSSQIKEEFRTLNMVTPTLRVEDCSIALLPRN 

HEKNRCMDELPPDRCLPFLITEDGESSNYTNAALM 

DSYKQPSAFIVTQHPLPNTVKDFWRLVLDYHCTS 

V V MJLN D V DFAQ LCr Q Y WPEN Cj V HRHGP1QVEF 

VSADLEEDIISRIFRIYNAARPQDGYRMVQQFQFL 

GWPMYRDTPVSKRSFLKLIRQVDKWQEEYNGG 

EGRTWHCLNGGGRSGTFCAISIVCEMLRHQRTV 

DWHAVKTLRNNKPNMVDLLDQYKFCYEVALE 










YLNSG 


3790 


A


261 


485 


EEQTPLHIASRLGKTEIVQLLLQHMAHPDAATTN 
GYTPLHISAREGQV\DV\ASVLLGRQGAAHSFRLT 
KVRRMTS 


3791 


A 


1 


5874 


LPPVTMSGKYIMEEHDSYSDQVWSIDELPSKQG 

YYLQGNYLRCVAEVGSFEHNLTTDLLNHLVFVQ 

KWMKEVNEVIQKVSGGEQPIPLWNEHDGTADG 

DKPBOLLYSLNLQFKGIQVTATTTSMRAVRFETG

LLELELSNRLQTKASPGSSSYLKLFGKCQVDLNL 

ALGQIVKHQVYEEAGSDFHQVAYFKTRIGLKNA

LREEISGSSDREAVLITLNRPIVYAQPVAFDRAVL 

FWLNYK\AAYDNWNEQRMALHKDIHMATKEVV 

DMLPGIQQTSAQAFGTPFLQLTVNDLGICLPITNT 

AQSNHTGDLDTGSALVLTIESTLITACSSESLVSK 

GHFKNFCIRFADGFETSWDDWKPEIHGDLVMNA 

CVWDGTYEVCSRTTGQAAAESSSAGTWTLNVL 

WICMCGIDVHMDPNIGKRLNALGNTLTTLTGEED 

IDDIADLNSVNIADLSDEDEVDTMSPTIHTEATDY 

RRQAASASQPGELRGRKIMKRIVDIRELNEQAKV 

IDDLKKLGASEGTINQEIQRYQQLESVAVNDIRR 

DVRKKLRRSSMRAASLKDKWGLSYKPSYSRSKS

1SASGRPPLKRMERASSRVGETEELPEIRVDAASP 

GPRVTFNIQDTFPEETELDLLSVTIEGPSHYSSNSE 

GSCSVFSSPKTPGGFSPGPFQTEEGRRDDSLSSTS 

EDSEICDEKDEDHERERFYIYRKPSHTSRKKATGF

AAVHQLrTERWPTTPVNRSLSGTATEKNIDFELD 

IRVEIDSGKCVLHPTTLLQEHDDISLRRSYDRSSR 

SLDQDSPSKKKXFQTNYASTTHL^GKXVPSSL 

QTKPSDLETTVFYIPGVDVKLHYNSKTLKTESPN 

ASRGSSLPRTLSKESKLYGMKDSATSPPSPPLPST 

VQSKTOTLLPPQPPPIPAAKGKGSGGVKTAKLYA 

WVALQSLPEEMVISPCLLDFLEKALETIPITPVER 

NYTAVSSQDEDMGHFEIPDPMEES\TTSLVS\SSTS 

AYSSFPVDVVVYVRVQPSQIKFSCLPVSRVECML 

KLPSLDLVFSSNRGELEILGTTYPAETLSPGGNA 

TQSGTKTSASKTGIPGSSGLGSPLGRSRHSSSQSD 

LTSSSSSSSGLSFTACMSDFSLYVFHPYGAGKQIT 

AVSGLTPGSGGLGNVDEEPTSVTGRKDSLSINLE 

FVKVSLSRIRRSGGASFFESQSVSKSASKMDTTLI 

NISAVCDIGSASFKYDMRRLSEILAFPRAWYRRSI

ARRLFLGDQTTNLPTSGPGTPDSIEGVSQHLSPESS

RKAYCKTWEQPSQSASFTHMPQSPNVFNEHMTN

STMSPGTVGQSLKSPASIRSRSVSDSSVPRRDSLS

KTSTPFNKSNKAASQQGTPWETLVVFA1NLKQL 

NVQMNMSNVMGNTTWTTSGLKSQGRLSVGSNR 

DREISMSVGLGRSQLDSKGGWGGTIDVNALEM 

VAHISEHPNQQPSHKIQITMGSTEARVDYMGSSIL 

MGIFSNADLKLQDEWICVNLYNTLDSSITDKSEIF 

VHGDLKWDIFQVMISRSTTPDLIKIGMKLQEFFT 

QQFDTSKRALSTWGPVPYLPPKmTSNLEKSSQE 

QLLDAAHHRHWPGVLKVVSGCHISLFQPLPEDG 

MQFGGSMSLHGNHMTLACFHGPNFRSKSWALF 

HLEEPNIAFWTEAQKIWEDGSSDHSTYTVQTLDF 

HLGHhnTVTVTKPCGALESPMATITTOT 

HGVASVKEWFNYVTATRNEELNLLRNVDANNT










ENSTTVKNSSLLSGFRGGSSYNHETETIFALPRM 

QLDFKSIHVQEPQEPSLQDASLKPKVECSVVTEF 

TDfflCVTMDAELIMFLHDLVSAYLKEKEKAIFPP 

RILSTRPGQKSPIirHDDNSSDKDREDSITYTTVDW 

RDFMCNTWHLEPTLRLISWTGRKIDPVGVDYILQ

KLGFHHARTTIPKWLQRGVMDPLDKVLSVLIKK 

LuTAl>QDEK£KKGKDKEEH 


3792 


A 


1 


364 


QNGSTPLHHAASKNRHEIALMLLEGGANPDGKD 
HYEATAKHQATAKGNFKMIHILLYYKASTIIQDT 
EGNTPPHLVCD\RVEEAKLLVSQGA/SIYIE2v[KEE 
lUJP/LQVAKGALGLv^KRMVEG 


3793 


A 


2 


340 


DIVPNPKMAPLGDEAPTLEKVLTPELSEEEVSTR 
DDIQFHHFSSEEALQKVKYFVAKEDPSSQEEAHT 
PEAPPPQPPSSERCLGEMKCTLVRGDSSPRQAEL 
KSGPASRPAL 


3794 


A 


421 


15S 


SYWGEDYTYKFFEVILIDPFHKJVIRRNPDTQWI 
SKAVYKHREMCGLTSTGRKSHGLEKDRMFPHAI 
GGSCRAA*RRRKTLQFPCYH 


3795 


A 


24 


592 


GGMDSRVSGTTSNGETKPVYPVMEKKEEDGTLE 
RGHWNNKMEFVLSVAGEIIGLGNVWRFPYLCYK
NGGGAFFIPYLVFLFTCGIPVFLLETALGQYTSQG
G\nTAWRKICPIFEGIGYASQMIVILLNVYYnVLA 
WALFYLFSSFTIDLPWGGCYHEWNTEHCMEFQK 
TNGSLNGTSENATSPVIEFW 


3796 


A 


3 


592 


KPASTYSTSQPSMAPLLPIRTLPLIL1LLALLSPGA

ADFNISSLSGLLSPALTESLLVALPPCHLTGGNAT 

LMVRRANDSKVVTSSFVVPPCRGRRELVSVVDS 

GAGFTVTRLSAYQVTNLVPGTKFYISYLVKKGT 

ATESSREIPMFTLPRRNMESIGLGMARTGGMVVI 

TVLLSVAMFLLVLGFIIALALGSRK 


3797 


A 


1 


1556 


ATRLLRGSGSWGCSRLRFGPPAYRRFSSGGAYPN 

IPLSSPLPGVPKPVFATVDGQEKFETKVTTLDNGL 

RVASQNKFGQFCTVGILINSGSRYEAKYLSGIAH 

FLEKLAFSSTARFDSKDEILLTLEKHGGICDCQTS 

RDTTMYAVSADSKGLDTWALLADWLQPRLT 

DEEVEMTRMAVQFELEDLNLRPDPEPLLTEMIHE 

AAYRENTVGLHRFCPTENVAKINREVLHSYLRN 

YYTPDRMVLAGVGVEHEHLVDCARKYLLGVQP 

AWGSAEAVDIDRSVAQYTGGIAKLERDMSNVSL 

GPTPIPELTHMVGLESCSFLEEDFIPFAVLNMMM 

GGGGSFSAGGPGKGMFSRLYLNVLNRHHWMYN

ATSYHHSYEDTGLLCIHASADPRQVREMVEIITK 

EFILMGGTVDTVELERAKTQLTSMLMMNLESRP 

VIFEDVGRQVLATRSRKLPHELCTLIRNVKPEDV 

KRVASICMLRGKPAVAALGDLTDLPTYEHIQTAL 

SSKDGRLPRTYRLFR 


3798 


A 


73 


759 


ICRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 
QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 
LPWFLNDRPNIKCPKGGLAAYSTSVKLTSDGQV 
LASRFMAYHKPLKNSQDYTEALRAARELAANIT 

APiT "RTT VPfJTfYP AFPVPPVTTTKTVTTVTJAVT TTT DT?^ 

/uu'i-*ivjv vrui \jr t\r cvrr i lJiiNvri nv^ i Li 1 li^Jrilvj 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MILVDTVGFMALWGISYNAVSLINLVS 


3799 


A 


73


759


KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 
QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 










LPWFLNDRPNIKCPKGGLAAYSTSVNLTSDGQV

LASRFMAYHKPLKNSQDYTEALRAARELAANIT

ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV

MILVDTVGFMALWGISYNAVSLINLVS


3800 


A 


250 


1032 


G1FRSLRVLFPLFSVGRPQFARSLSAAPQLSDTAD 

TMGFGDLKSPAGLQVLNDYLADKSYIEGYVPSQ 

ADVAVFEAVSSPPPADLCHALRWYNHIKSYEKJE 

KASLPGVKKALGKYGPADVEDTTGSGATDSKD 

DDDIDLFGSDDEEESEEAKRLREERLAQYESKKA

KKPALVAKSSILLDVKPWDDETDMAKJLEECVRS 

IQADGLVWGSSKLVPVGYGIKKLQIQCVVEDDK 

VGTDMLEEQITAFEDYVQSMDVAAFNKI 


3801 


A 


155


656 


SREMELVTFRDVAIEFSPEEWKCLDPAQQNLYR 

DVMLENYRNLVSLGFVISNPDLVTCLEQIKEPCN 

LKIHETAAKPPAICSPFSQDLSPVQGIEDSFHKLIL 

KRYEKCGHENLQLRKGCKRVNECKVQKGVNNG 

VYQCLSTTQSKIFQCNTCVRVFSTSSHSNKHK 


3802 


A 


1 


1428 


VTVSPETHMDLTKGCVTFEDIAIYFSQDEWGLLD 

EAQRLLYLEVMLENFALVASLGCGHGTEDEETP 

SDQNVSVGVSQSKAGSSTQKTQSCEMCVPVLKD 

ILHLADLPGQKPYLVGECTNHHQHQKHHSAKKS 

LKRDMDRASYVKCCLFCMSLKPFRKWEVGKDL 

PAMLRLLRSLVFPGGKKPGT1TECGEDIRSQKSH 

YKSGECGKASRHKHTPVYHPRVYTGKKLYECSK

CGKAFRGKYSLVQHQRVHTGERPWECNECGKF 

FSQTSHLNDHRRIHTGERPYECSECGKLFRQNSS 

LVDHQKIHTGARPYECSQCGKSFSQICATLVKHQ 

RVHTGERPYKCGECGNSFSQSAILNQHRR1HTGA 

KPYECGQCGKSFSQKATLIKHQRVHTGERPYKC 

GDCGKSFSQSSIL1QHRRIHTGARPYECGQCGKSF 

SQKSGLIQHQVVHTGERPYECNKCGNSFSQCSSL 

IHHQKCHNT 


3803 


A 


193 


617 


LFPFLGSESKNGEADSSDKE3V1KHGQKSPTGKQTS 
QHLKRLKKSGLGHLKWTKAEDIDBETPGSILVNT
NLRALINKHITASLPQHFQQYLLLLLPEVDRQMG 
SDGILRLSTSALNNEFFAYAAQGWKQRLAEGKF 
VFSIIM 


3804 


A


197 


479 


SSSRASPPEHPSSQAHCGPLVLSHACPEVTNKWS 
TGSSSSPNSSWVSSPLQPEGLSGSSRMKGGSATKI 
LLETLLLAAHMTADQGIASSQRCLL 


3805 


A 


1 


385 


QSADTLFPGDINFNVSGLFSAVTLQDTVSDRLAS 
EELPSTAVPTPATTPAPAPAPAPATAPALVSAAT 
KERTESEVPPRPASPKVTRSPPETAAPVEDMARR 
SELAVGGEEGTEGGRGEGTGSPMSSY 


3806 


A 


47 


1033 


LQGDTWHLSFLSHFSRLHGGVPGRGLLEGNLLQ 

PQAPGHDMTSrPFPGDRLLQVDGVILCGLTHKQA 

VQCLKGPGQVARLVLERRVPRSTQQCPSANDSM 

GDERTAVSLVTALPGRPSSCVSVTDGPKF*SSN* 

KRIANGLGFSFVQMEKESCSHLKSDLVRDCRLFP 

GHPAEENGAIAAGDIELGREWEGPRKASSSRCRG

SWAMQLSVQAGPSFASYYPAAVEVLHLLRGAPQ 

EVTLLLCRPPPGALPELEQEWQTPELSADKEFTR 

ATCTDSCTSPILGSRGQLGGTVPPQMQGKAWGL 

RPESSQKAIREGTMGAKTERDLGPVP 


3807 


A 


656 


1238 


RCPSLLPPSWPLPTLQTLTRTPGNKAIAGGAGLW 

A V L WGSERTPP YR* ON *N QRG A VPCLRPHRLRP 

QDKFLVLASDGLWDMLSNEDVVRLVVGHLAEA 

DWHKTDLAQRPANLGLMQSLLLQRKASGLHEA 

DQNAATRLIRHAIGNNEYGEMEAERLAAMLTLP 

EDLARMYRDDITVTVVYFNSESIGAYYKGG 


3808 


A 


26 


2195 


SQYSESVAGRQASPERLLGSYHAMASTVEGGDT 

ALLPEFPRGPLDAYRARASFSWKELALFTEGEG 

MLRPKKTTFSALENDPLFARSPGADLSLEKYREL 

NFLRCKRIFEYDFLSVEDMFKSPLKVPALIQCLG 

MYDSSLAAKYLLHSLVFGSAVYSSGSERHLTYIQ 

KIFRMEIFGCFALTELSHGSNTKAIRTTAHYDPAT

EEFimSPDFEAAKFWVGNMGKTATHAVVFAKL 

CVPGDQCHGLHPFIVQIRDPKTLLPMPGVMVGD1 

GKKLGQNGLDNGFAMFHKVRVPRQSLLNRMGD 

VTPEGTYVSPFKDVRQRFGASLGSLSSGRVSrVSL 

AILNLKLAVAIALRFSATRRQFGPTEEEEIPVLEY 

PMQQWRLLPYLAAVYALDHFSKSLFLDLVELQR 

GLASGDRSARQAELGREIHALASASKPLASWTT 

QQGIQECREACGGHGYLAMNRLGVLRDDNDPN 

CTYEGDNNILLQQTSNYLLGLLAHQVHDGACFR 

SPLKSVDFLDAYPGILDQKFEVSSVADCLDSAVA 

LAAYKWLVCYLLRETYQKLNQEKRSGSSDFEAR 

NKCQVSHGRPLALAFVELTVVQRFHEHVHQPSV 

PPSLRAVLGRLSALYALWSLSRHAALLYRGGYF 

SGEQAGEVLESAVLALCSQLKDDAVALVDV1AP 

PDFVLDSPIGRADGELYKNLWGAVLQESKVLER 

ASWWPEFSVNKPVIGSLKSKL 


3809 


A 


117 


830 


GFGIMERVGCTLTTTYArlPRPTPTNFLPAISTMAS 

SYRDRFPHSNLTHSLSLPWRPSTYYIWASNSPSV 

APYCTRSQRVSENTMLPFVS>7RTTFFTRYTPDDW 

YKSNLTWQESNTSRHNSEKLRVDTS^ 

QraKTQADTTQ^GER\^IGFWKSEIIHELDEM 

IGETNALTDVKKRLERALMETEAPLQVARECLF 

HREKRMGIDLVHDEVEAQLLTVNVGEMHQSQA 

A 


3810 


A 


3


518 


VIQELEGGSGADLGEHSCRPASQPRFPRPAEARS 

HrA 1 KKrAoGr AMGKTNSKLAPEVLEDLVQNTE 

FSEQELKQWYKGFLKDCPSGILNLEEFQQLYIKF 

FPYGDASKFAQHAFRTFDKNGDGTIDFREFICAL 

SVTSRGSFEQKLNWAFEMYDLDGDGRJTRLEML

EIIE 


3811 


A 


81 


1147


GCGYGCSGAGGAAIGEPMAKWGEGDPRWIVEE 

RADAT>T\/NNWHWTERDASNWSTDKLKTLFLAV 

QVQNEEGKCEVTEVSKLDGEASINNRKGKLIFFY 

EWSVKLNWTGTSKSGVQYKGHVEIP>JLSDENSV 

DEVEISVSLAKDEPDTNLVALMKEEGVKLLREA 

MGIYISTLKTEFTQGMTLPTMNGESVDPVGQPAL 

l^TCTJD 1/" A VTi A "DOT/" T"/~\ A TlTll/OAfT/mTr'T/'TT'T T/rTI ~'T 

K liitKJ^AKPAr 1 QARF VG VKIPTCKJTLKETFL 
TSPEELYRVFTTQELVQAFTHAPATLEADRGGKF
HMVDGNVSGEFTDLWFKHTVMKWRFK^WPFn 
HFATITLTFIDKKGETELCMEGRGIPAPEEERTRQ 
GWQRYYFEGIKQTFGYGARLF 


3812 


A 


20 


558 


PCGTAASTHAYDRRAKCRQQQQQQQNGGQNKV 
RPAKKKTSPAREVSSESGTSGQrTPPSSTSVPTIAS 










SSAPVSIWSPASISPLSDPLSTSSSCMQRSYPMTYT 
QASGYSQGYAGSTSYFGGMDCGSYLTPMHHQL 
PGPGATLSPMGTNAVTSHLNQSPASLSTQGYGAS 
KLWGFNFNH 


3813 


A 


1 


1016 


CTEPPRRSTRTPAALASLRPYTDYVVVSDQILQES 

EDFFTLIESHEGKPLKLMVYNSKSDSCREVTVTP 

NAAWGGEGSLGCGIGYGYLHiaPTQPPSYHKKPP

GTPPPSALPLGAPPPDALPPGPTPEDSPSLETGSRQ 

SDYMEALLQAPGSSMEDPLPGPGSPSHSAPDPDG 

LPHFMETPLQPPPPVQRVMDPGFLDVSGISLLDN 

SNASVWPSLPSSTELTTTAVSTSGPEDICSSSSSHE 

RGGEATWSGSEFEVSFLDSPGAQAQADHLPQLT 

LPDSLTSAASPEDGLSAELLEAQAEEEPASTEGLD 

TGTEAEGLDSQAQISTTE*HPGL*QGP 


3814 


A 


2 


884 


VFWQVRNAGSSPLSAACPLFRTPAPQPCGSWGR 

CCIPHASTGCRPMAERGELDLTGAKQNTGVWLV 

KVPKYLSQQWAKASGRGEVGKLRIAKTQGRTE 

VSFTLNEDLANIHD1GGKPASVSAPREHPFVLQSV 

GGQTLTVFTESSSDKLSLEGIWQRAECRPAASE 

NYMRLKRLQIEESSKPVRLSQQLDKVVTTNYKP 

VANHQYNIEYERKKKEDGKRARADKQHVLDML 

FSAFEKHQYYNLKDLVDITKQPWYLKEILKEIG 

VQNVKGIHKNTWELKPEYRHYQGEEKSD 


3815 


A


17


411 


NIGDWEDIGKSPERIIQYYGPATWAQDGSRGYCT 
PIYMLNmiRLQAVLEIIMNERANALDLLAQQTTK 
MRNANYQNRLALDYLLAHEGGV*GKFSLTNCC 
LEIDDNGKAIMEITARMRKLAHIPVQTWER 


3816 


A 


3 


1172 


SHWQRJUDRRCVRNMAERGRKRPCGPGEHGQRI 

EWRKWKQQKKEEKKKWKDLKLMKKLERQRAQ 

EEQAKRLEEEEAAAEKEDRGRPYTLSVALPGSIL

DNAQSPELRTYLAGQIARACABFCVDEIVVFDEE 

GQDAKTVEGEFTGVGKKGQACVQLARELQYLEC 

PQYLRKAFFPKHQDLQFAGLLNPLDSPHHMRQD 

EESEFREGVVVDRPTRPGHGSFVNCGMKKEVKI 

DKNLEPGLRVTVRLNQQQHPDCKTYHGKWSS 

QDPRTKAGLYWGYTVRLASCLSAVFAEAPFQDG 

YDLTIGTSERGSDVASAQLPNFRHALVVFGGLQG 

LEAGADADPNLEVAEPSVLFDLYVNTCPGQGSR 

TIRTEEAILISLAALQPGLIQAGARHT 


3817 


A 


246 


1197 


FLSAGMSNFTHYAYLLMIESLMLGKVPPHVPSH 

HFIFHDDGSARQKGESDYKVIIQQWFSKSGPWTT 

SSNVTWGLLELQQSISESAVLTIPPGDSGAGSNLI 

TMFLRNRKETDLCSGRSKVNRGWNSGRCKQRG 

KTEQPGEPLEHVYVTIKHAVALESRHQKGELQC 

LIKMCIPLSI^LQMFFSPPHWEAWLQRVQQLAK 

NTRYFRQRLQEMGFnYGNENASVVPLLLYMPG 

KVAAFARHMLEKKJGVVWGFPATPLAEARARF

CVSAAHTREMLDTVLEALDEMGDLLQLKYSRH 

KKSARPELYDETSFELED 


3818 


A 


215 


789 


NPQSSSSEGSSEEFQVNGHNRLLVQRSEVTQAPG 

V>i i VUVJtiuiiUUlrlv^AlLKy JN VLLrKKASOFSLS 

LEIVKNYSSTAFDLTVTLKYTGIRNKSSMVVIDV 

KMLSGFTPTMSSffiELENKGQVMKTEVKNDHVL 

FYLENWGRADSFTFSVEQSNLVFN1QPAPGMVY 

DYYEKEEYALAFYHINSSSVSE 


3819 


A 


1 


1483 


RIPDSIISRGVQGLPRDTASLSTTPSESPRAQATSR 

LSTASCPTPKVQSRCSSKENILRASHSAVDriTCVA 

RRHRMSPFPLTSMDKAFITVLEMTPVLGTEIINYR 

DGMGRVLAQDVYAKDNLPPFPASVKDGYAVRA 

ADGPGDRFnGESQAGEQPTQTVMPGQVMRVTT 

GAPIPCGADAWQVEDTELIRESDDGTEELEVRIL 

VQARPGQDIRPIGHDIKRGECVLAKGTHMGPSEI 

GLLATVGVTEVEVNKFPVVAVMSTGNELLNPED 

DLLPGKIRDSNRSTLLATIQEHGYPTINLGIVGDN 

PDDLLNALNEGISRADVIITSGGVSMGEKDYLKQ 

VLDIDLHAQIHFGRVFMKPGLPTTFATLDIDGVR 

KIIFALPGNPVSAVVTCNLFVVPALRKMQGILDP 

RPTIIKARLSCDVKLDPRPEYHRCILTWHHQEPLP 

WAQSTGNQMSSRLMSMRSANGLLMLPPKTEQY 

VELHKGEVVDVMVIGRL 


3820 


A 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGY1 

LNAHRKCVDINECVtDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCEDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGILCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHJFRIGPAP 

AFTGDTIALNI1KGNEEGYFGTRRLNAYTGVVYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI

FFTTFAL 


3821 


A 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGY1 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF / 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC

HNLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGE.CTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHN1QGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGWYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI

"PPTTT? AT 


3822 


A 


2502 


1540 


MAAATRGCRPWGSLLGLLGLVSAAAAAWDLAS 
LRCTLGAFCECDFRPDLPGLECDLAQHLAGQHL 
AKALVVKALKAFVRDPAPTKPLVLSLHGWTGTG 
KSYVSSLLAHYLFQGGLRSPRVHHFSPVLHFPHP
SHIERYKKDLKSWVQGNLTACGRSLFLFDEMDK 

UppHT X/TPVT PPT7T n^Q\XA/VVnTKTVP V A TUrCTOXT 
lvirr OLlviii V LKrrLUoj W V V iulIN x rUNAlrJUrioN 

TGGEQINQVALEAWRSRRDREEILLQELEPVISR 

AVLDNPHHGFSNSGIMEERLLDAVVPFLPLQRHH 

VRHCVLNELAQLGLEPRDEWQAVLDSTTFFPE 

DEQLFSSNGCKTVASRIAFFL 


3823 


A 


1


3174 


YGCEKTTEGRIPLKNIYRLFSADRKRVETALEAC 

SLPSSRNDSIPQEbFinPEVYRVFLNNLCPRPEIDNI 

FSEFGAKSKPYLTVDQMMDFINLKQRDPRLNEIL 

YPPLKQEQVQVLIEKYEPNNSLARKGQISVDGFM 

RYLSGEENGVVSPEKLDLNEDMSQPLSHYFINSS 

HNTYLTAGQLAGNSSVEMYRQVLLSGCRCVELD 

CWKGRTAEEEPVITHGFTMTTEISFKEVIEAIAEC 

AFKTSPFPILLSFENHVDSPKQQAKMAEYCRLIFG 

DALLMEPLEKYPLESGWLPSPMDLMYKILVKN 

KKKSHKSSEGSGKKKLSEQASNTYSDSSSMFEPS 

SPGAGEADTESDDDDDDDDCKKSSMDEGTAGSE 

AMATEEMSNLV7s[YlQPVCTESFEISKKRNKSFEM 

SSFVETKGLEQLTKSPVEFVEYNKMQLSRIYPKG 

TRVDSSNYMPQLFWNAGCQMVALNFQTMDLA 

MQmMGMYEYNGKSGYRLKPEFMRRPDKHFDP 

FTEGIVDGIVANTLSVKIISGQFLSDKKVGTYVEV 

DMFGLPVDTRRKAFKTKTSQGNAVNPVWEEEPI 

VFKKWLPTLACLRIAVYEEGGKFIGHRILPVQAI 

RPGYHYICLRNERNQPLTLPAVFVYIEVKDYVPD 

TYADVIEALSNPIRYVNLMEQRAKQLAALTLEDE 

EEVKKEADPGETPSEAPSEARTTPAENGVNHTTT 

LTPKPPSQALHSQPAPGSVKAPAKTEDLIQSVLTE 

VEAQTIEELKQQKSFVKLQKXHYKEMKDLVKR 

HHKKTTDLIKEHTTKWEIQNDYLRRRAALEKS 

AKKDSKKKSEPSSPDHGSSTIEQDLAALDAEMTQ 

KLIDLKDKQQQQLLNLRQEQYYSEKYQICREHIK 

LLIQKLTDVAEECQNNQLKKLKEICEKEKKELKK 

QEWQYIKRLEEAQSKRQEKLVEKHKEIRQQILD 
EKPKLQVELEQEYQDKFKRLPLEDLEFVQEAMKG 
KISEDSNHGSAPLSLSSDPGKVNHKTPSSEELGGD 
IPGKEFDTPL 


3824


A 


1 




ILri W r V rlK W 6 UKJN IN Kn-JsJ \j VnVUrJibllJNMiir Y 

CCRETLKSLRPECFIYDLSAVVMHHGKGFGSGH 

YTAYCYNSEGGFWVHCNDSKLSMCTMDEVCKA 

QAYILFYTQRVTENGHSKLLPPELLLGSQHPNED 


3825 


A 


3


364 


GIRAKF?NKIPVVVERYPRETFLPPLDKTKFLVPQ 
ELTMTQFLSIIRSRMVLRATEAFYLLVNNKSLVS 
MSATMAEIYRDYKDEDGFVYMTYASQETFGCLE 
SAAPRDGSSLEDRPLHPL 


3826 


A 


1


1237 


PEKKFERECREAEKAQQSYERLDNDTNATKADV 

EKAKQQLNLRTHMADENKNEYAAQLQNFNGEQ 

HKHFYWIPQIYKQLQEMDERRTIKLSECYRGFA 

DSERKVIPIISKCLEGMILAAKS\nDERRDSQMVV 

DSFKSGFEPPGDFPFEDYSQHIYRTISDGTISASKQ 

ESGKMDAKTTVGKAKGKLWLFGKKPKGPALED 

FSHLPPEQRRKKLQQRIDELNRELQKESDQKDAL 

NKMKDVYEKNPQMGDPGSLQPKLAJBTMNNTOR
LRMEIHKNEAWLSEVEGKTGGRGDRRHSSDENH 
LVTQGRESPEGSYTDDANQEVRGPPQQHGHHNE 
FDDEFEDDDPLPAIGHCKAIYPFDGHNEGTLAMK 
EGEVLYIffiEDKGDGWTRARRQNGEEGYVPTSYI 
DVTLEKNSKGS 


3827 


A 


2 


1584 


INPVSSAVNGEAHSSHETRGQNSNALPSVLLELL 

SQSCLIPAMSSYLRNDSVLDMARHVPLYRALLEL 

LRAIASCAAMVPLLLPLSTENGEEEEEQSECQTS 

VGTLLAKMKTCVDTYTNRLRSKRENVKTGVKP 

DASDQEPEGLTLLVPDIQKTAEIVYAATTSLRQA 

NQEKKLGEYSKKAAMICPKPLSVLKSLEEKYVAV 

MKKLQFDTFEMVSEDEDGKLGFKVNYHYMSQV 

KNANDANSAARARRLAQEAVTLSTSLPLSSSSSV 

FVRCDEERLD3MKVLITGPADTPYANGCFEFDVY 

FPQDYPSSPPLVNLETTGGHSVRFNPNLYNDGKV 

CLSILNTWHGRPEEKWNPQTSSFLQVLVSVQSLI 

LVAEPYFNEPGYERSRGTPSGTQSSREYDGNIRQ 

ATVKWAMLEQIRlslPSPCFKEVIHKHFYLKRVEIM 

AQCEEWIADIQQYSSDKRVGRTMSHHAAALKRH 

TAQLREELLKLPCPEGLDPDTDDAPEVCRATTGA 

EETLMHDQVKPSSSKELPSDFQL 


3828 


A 


1415 


845 



PRVPATLVSLDPWHCFPTAGRLAGSTWVPPACT 

LQLGPSSEHELDNHRAPLLSLPSQESLSFTPWYLV 

ACKPLFHIFCPLFACFMQEGKVQYLFLHLSHMRL 

LNYYFFPFLAPESLMQALEDLDYLAALDNDGNL 

SEFGIIMSEFPLDPQLSKSILASCEFDCVDEVLTIA 

AMVTGILNDYSFSFFANLH 


3829


A 


199 


683 


VDHTPVLSKPQCFSSVKWGATLSARSQKTSGIGR 
LMVHVIEATELKACKPNGKSNPYCE1SMGSQSYT 
TRTIQDTLNPKWNFNCQFFIKDLYQDVLCLTLFD 
RDQFSPDDFLGRTEEPVAKIRTEQESKGPMTRRLL 
LHEVPTGEVWVRFDLQLFEQKTLL 


3830 


A 


1747 


404 


RKMMEESGIETTPPGTPPPNPAGLAATAMSSTPV 

PLAATSSFSSPNVSSMESFPPLAYSTPQPPLPPVRP 

SAPLPFVPPPAVPSVPPLVTSMPPPVSPSTAAAFG 

NPPVSHFPPSTSAPNTLLPAPPSGPPISGFSVGSTY

DITRGHAGRAPQTPLMPSFSAPSGTGLLPTPITQQ

ASLTSLAQGTGTTSAITFPEEQEDPRITRGQDEAS 

AGGIWGFIKGVAGNPMVKSVLDKTKHSVESMIT 

TLDPGMAPYDCSGGELDIVVTSNKEVKVAAVRD

AFQEVFGLAVWGEAGQSNIAPQPVGYAAGLKG 

AQERIDSLRRTGVIHEKQTAVSVENFIAELLPDK 

WFDIGCLWEDPVHGIHLETFTQA7PVPLEFVQQ 

AQSLTPQDYNLRWSGLLVTVGEVLEKSLLNVSR 

TDWHMAFTGMSRRQMIYSAARAJAGMYKQRLP 

PRTV 


3831 


A 


5 


674 


FWTRSAWHEGLQQ3S4KANDPSLQEVNLYNIKNIP 
IPTLREFAKALETNTHVKKFSLAATRSNDPVAIAF 
ADMLKWTTLTSLNIESHFITGTGILALVEALKEN 
DTLTEIKIDNQRQQLGTAVEMEIAQMLEENSRIL 

EQTSIWQWSQSIAGFNPQFEVQGQNARSWMEE 
LGKAFHQFVRRELKQTEGKLP 


3832 


A 


164 


782 


EPWVPMDVAESPERDPHSPEDEEQPQGLSDDDIL 
RDSGSDQDLDGAGVRASDLEDEESAARGPSQEE
bDNHSDEBDKASliFKSQDQDSEVNELSRGPTSSP 

CEEEGDEGEEDRTSDLRDEASSVTRELDEHELDY 

DEEVPEEPAPAVQEDEAEKAGAEDDEEKGEGTP 

REEGKAGVQSVGEKESLEAAKEKKKEDDDGEID 

DEEMY 


3833 


A 


122 


1676 


SQPPHFTQKMNENKDTDSKKSEEYEDDFEKDLE 

WLINENEKSDASIIEMACEKEENINQDLKENETV 

MEHTTCRHSDPDKSLQDEVSPRRNDIISVPGIQPLD 

PISDSDSENSFQESKLESQKDLEEEEDEEVRRYIM 

EKIVQAT^LQNQEP\^KRERKLKJ1CDQLVDL 

EVPPLEDTTTSKNYFENERNMFGKLSQLCISNDF 

GQEDVLLSLTNGSCEENKDRTILVERDGKFELLN 

LQDIASQGFLPPDSfNANSTENDPQQLLPRSSNSSV 

SGTKKEDSTAKfflAVTHSSTGEPLAYIAQPPLNR 

KTCPSSAVNSDRSKGNGKSNHRTQSAHISPVTST 

YCLSPRQKELQKQLEEKREKLKREEERRKIEEEK 

EKKRENDIVFKAWLQKKEIEQVLEMRRIQRAICEI 

EDMNSRQENRDPQQAFRLWLKKKHEEQMKERQ 

TEELRKQEECLFFLKGTEGRERAFKQWLRRKRM 

bruVOAbC^A VKbK rRQLRLEAKRSKQLQHHL YM 

SEAKPFRFTDHYN 


3834 


A 


575


774 


RSRTEELSNSGILKAMSKDLVTFGDVAVNFSQEE 
WEWLNPAQRNLYRKVMLENYRSLVSLGKDMSP 


3835 


A


2 


100 


ASDFYLRYYVGHKGKFGHEFLEFEFRPDGVYV 


3836 


A 


91 


749 


RPTPGHGDFWMQPLTKDAGMSLSSVTLASALQV 

RGEALSEEEIWSLLFLAAEQLLEDLRNDSSDYVV 

CPWSALLSAAGSLSFQGRVSHIEAAPFKAPELLQ 

GQSEDEQPDASQMHVYSLGMTLYWSAGFHVPP

HQPLQLCEPLHSILLTMCEDQPHRRCTLQSVLEA 

CRVHEKEVSVYPAPAGLHIRRLVGLVLGTISEVS 

REPCFSSSSCWSCVAIKI 


3837 


A 


3 


1214 


SLGCTOSARGKGQDDEVRTLMANGAPFTTOWFS 

KLRVSCGYIGDNCKNGADVNAKDMLKMTALH 

WATERHHRDVVELLIKYGADVHAFSKFDKSAFD 

IALEKNNAEIL VE.QEAMQNQVNVNPERANPVTD 

PVSMAAPFIFTSGEVVNLASLISSTNTKTTSGDPH 

ASTVQFSNSTTSVLATLAALAEASVPLSNSHRAT 

ANTEEnEGNSVDSSIQQVMGSGGQRVTTIVTDGV 

PLGMQTSIPTGGIGHPFIVTVQDGQQVLTVPAGK 

VAEETVnCEEEEEKLPLTKKPRIGEKTNSVEESKE 

GNERELLQQQLQEANRRAQEYRHQLLKKEQEAE 

QYRLKLEAIARQQPNGVDFTMVEEVAEVDAW 

VTEGELEERETKVTGSAGATGPPTRVSMATVSS 


3838 


A 


1 


1332 


MIEDNKENKDHSLERGRASLIFSLKNEVGGLIKA 

LKIFQEKHVNLLHIESRKSKRRNSEFEIFVDCDIN 

REQLNDIFHLLKSHTNVLSVNLPDNFTLKEDGME

TVPWFPKKISDLDHCANRVLMYGSELDADHPGF 

KDNVYRKRRKYFADLAI^NYKHGDPPKVEFTEE 

EIKTWGTWQELNKLYPTHACREYLKNLPLLSKY 

CGYREDNIPQLEDVSNFLKERTGFSIRPVAGYLSP 

RTYFT 9frT APP VPPPTnWT?TT^^r>PT7VTPT3PnTr , t T 
isxj r i-/r\r i\ v r nv^ 1 1^/ 1 yjsxxooUrri UrJorUlOrl 

ELLGHVPLLAEPSFAQFSQEIGLASLGASEEAVQ 

KLATCYFFTVEFGLCKQDGQLRVFGAGLLSSISE 

LKHALSGHAKVKPFDPKITCKQECLITTFQDVYF 

VSESFEDAKEKMREFTKTIKRPFGVKYNPYTRSI
QILKDTKSITSAMNELQHDLDVVSDALAKVSRKP 
SI 


3839 


A 


3093 


520 


MVNFTVDQIRACvmKKANIRI^ 

TLTDSLVCKAGIIASARAGETRFTDTRKDEQERCI 

TIKSTAISLFYELSENDLNFIKQSKDGAGFLINLED 

SPGHVDFSSEVTAALRVTDGALVWDCVSGVCV 

QTETVLRQAIAERIKPVLMMNKMDRALLELQLE 

PEELYQTFQRIVENVNVnSTYGEGESGPMGNIMI 

DPVLGWGFGSGLHGWAFTLKQFAEMYVAKFA 

AKGEGQLGPAERAKKVEDMMKKLWGDRYFDP 

ANGBCFSKSATSPEGKKLPRTFCQLILDPIFKVFDA 

IMNFKKEETAKLIEKLDIKLDSEDKDKEGKPLLK 

AVMRRWLPAGDALLQMITIHLPSPVTAQKYRCE 

LLYEGPPDDEAAMGIKSCDPKGPLMlvmSKMVP 

TSDKGRFYAFGRVFSGLVSTGLKVRJMGPNYTPG 

KKEDLYLKP1QRTILMMGRY VEPIEDVPCGNIVG 

LVGVDQFLVKTGTITTreHAHNMRVMKFSVSPV 

VRVAVEAKNPADLPKLVEGLKRLAKSDPMVQCI 

IEESGEHIIAGAGELHLEICLKDLEEDHACIPIKKS 

DPVVSYRETVSEESmO-CLSKSPNBGtlNRLYMKA 

RPFPDGLAEDIDKGEVSARQELKQRARYLAEKY 

EWDVAEARKIWCFGPDGTGPNILTDITKGVQYL

NEIKDSWAGFQWATKEGALCEENMRGVRFDV 

RLMEPIYLVE1QCPEQVVGGIYGVLNRKRGHVFE 
ESQVAGTPMFVVKAYLPVNESFGFTADLRSNTG 
GQAFPQCVFDHWQILPGDPFDNSSRPSQWAETR 
KRKGLKEGIPALDKFLDKL 


3840 


A 


2 


753 


SSTRSRDFCCSEAIQGSLTRRERRASGVRTRRSQG 
SSAMASKILLNVQEEVTCPICLELLTEPLSLDCGH 
SLCRACITVSNKEAVTSMGGKSSCPVCGISYSFE 

T-TT O A "MOT-IT A*MT\7PPT VP\rVT C"DF\'\Tm/"'Ii r T? T\X Pnu 

HGEKLLLFCKEDRKVICWLCERSQEHRGHHTVL 
TEEWKECQEKLQAVLKRLKKEEEEAEKLEADIR 
EEKTSWKYQVQTERQRIQTEFDQLRSILNNEEQR 
ELQRLEEEEKKT 


3841 


A 


2 




HFGNLKVHERIHTGEKPYECKECRKAFSWLTCL 
LRHERIHTGKXSYECQQCGKAFTRSRFLRGHEKT 

L 


3842


A 


311 


88 


AVLKNMAPMTALGLLDLHILNLILFLSAGEDFTS
WSEIMMYILLVFLTLWLLIEMIYCYRKVSKAEE 
AAQENA 


3843 


A


3


1175 


APERNSRIDDFVRRVESKATSARCGLWGSGPRRR 

PASGMFRGLSSWLGLQQPVAGGGQPNGDAPPEQ 

PSETVAESAEEELQQAGDQELLHQAKDFGNYLF 

NFASAATKKITESVAETAQTIKKSVEEGKIDGIID 

KTTIGDFQKEQKKFVEEQHTKKSEAAVPPWVDT 

NDEETIQQQILALSADKRNFLRDPPAGVQFNFDF

DQMYPVALVMLQEDELLSKMRFALVPKLVKEE 

VFWRNYFYRVSLIKQSAQLTALAAQQQAAGKEE 

KSNGREQDLPLAEAVRPKTPPVVIKSQLKTQEDE 

EEISTSPGVSEFVSDAFDACNLNQEDLRKEMEQL

VLDKKQEETAVLEEDSADWEKELQQELQEYEV
VTESEKRDENWDKEIEKMLQEEN 


3844 


A 


798 


148 


LPPAQIPEAWLLLANWVVLILVPLKDRLIDPLLL 
RCKLLPSALQKMALGMFFGFTSVIVAGVLEMER
LHYIHHNETVSQQIGEVLYNAAPLSIWWQIPQYL 
LIGISEIFASIPGLEFAySEAPRSMQGAIMGIFFCLS 
GVGSLLGSSLVALLSLPGGWLHCPKDFGNINNCR 
MDLYFFLLAGIQAVTALLFVWIAGRYERASQGP 
ASHSRFSRDRG 


3845


A


3 


1934 


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS 

MET1YNMLVETGELDNTYIVYTADHGYHIGQFG 

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 

LNIDLAPTILDIAGLDPADMDGKSILKLLDTERP 

VNRFHLKKKMRYWRDSFLX^RGKLLHKRDNDK 

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 

LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 

AWKDHKXHIDHEIETLQNKIKKLREVRGHLK^ 

RPEECDCHKISYHTQHKGRLKHRGSSLHPFRKGL

QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS 

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG 


3846 


A 


3 


1934 


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI
MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS 
METTiTMLVETGELDNTYIVYTADHGYHIGQFG 
LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 
LNIDLAFnLDIAGLDIPADMDGKSELKLLDTERP 
VNRFHLKKKMRVWRDSFLVERGKLLflKRDNDK 
VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 
QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 
LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 
KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 
QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 
DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 
AWKDHKXHIDHEffiTLQNKIK^ 
RPEECDCHKISYHTQHKGRLKHRGSSLHPFRKGL 
QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS 
MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 
NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT
DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 
\TCQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 
PEMKRPSSKSLGQLWEGWEG 


3847 


A 


1 


1257


MWSAVLTAFHTGTSNTTFVYYENTYMNITLPPP 
FQHPDLSPLLRYSFETMAPTGLSSLTVNSTAVPTT 
PAArKSLNLPLQlTLSAIMIFILFVSFLGNLVVCLM 
VYQKAAMRSAINILLASLAFADMLLAVLNMPFA 
LVTILTTRWIFGKFFCRVSAMFFWLFVIEGVAILL 
IISIDRFLirVQRQDKLNPYRAKVLIAVSWATSFCV 
AFPLAVGNPDLQIPSRAPQCVFGYTTNPGYQAYV
]LISL1SFFIPFLVILYSFMGILNTLRHNALRIHSYPE 

ILILFAWIVCWAPFTTYSLVATFSKHFYYQHNFF 
EISTWLLWLCYLKSALNPLIYYWRIKKFHDACLD 
MMPKSFKFLPQLPGHTKRRIRPSAVYVCGEHRT 
W 


3848 


A 


3 


2827 


SSAVAARRRRSWASLVLAFLGVCLGITLAVDRS 

NFKTCEESSFCKRQRSIRPGLSPYRALLDSLQLGP 

DSLTVHLIHEVTKVLLVLELQGLQKNMTRFRIDE 

LEPRRPRYRVPDVLVADPPIARLSVSGRDENSVE 

LtMAEGPYKIILTAKPFRLDLLEDRSLLLSVNARG 

LLEFEHQRAPRVSQGSKDPAEGDGAQPEETPRD

GDKPEETQGKAEKDEPGAWEETFKTHSDSKPYG 

PMSVGLDFSLPGMEHWGIPEHADNLRLKVTEG 

GEPYRLYNLDVFQYELYNPMALYGSVPVLLAHN 

PHRDLGIF\^NAAETWVDISSNTAGKTLFGKMM 

DYLQGSGETPQTDVRWMSETGIIDVFLLLGPSISD 

VFRQYASLTGTQALPPLFSLGYHQSRWNYRDEA 

DVLEVDQGFDDHNLPCDVIWLDIEHADGKRYFT 

WDPSRFPQPRTMLERLASKRRKLVAIVDPHIKVD 

SGYRVHEELRNLGLYVKTRDGSDYEGWCWPGS 

AGYPDFTNPTMRAWWANMFSYDNYEGSAPNLF

VWNDMREPSVFNGPEVTMLKDAQHYGGWEHR 

DVHNIYGLYVHMATADGLRQRSGGMERPFVLA 

RAFFAGSQRFGAVWTGDNTAEWDHLKISIPMCL 

SLGLVGLSFCGADVGGFFKNPEPELLVRWYQMG 

AYQPFFRAHAHLDTGRREPWLLPSQHNDIIRDAL 

GQRYSLLPFWYTLLYQAHREG1PVMRPLWVQYP

QDVTIPNIDDQYLLGDALLVHPVSDSGAHGVQV 

YLPGQGEVWYDIQSYQKHHGPQTLYLPVTLSS3P 

VFm?OOTTVP*R WXyTRVPPQQPPTV^T^rinPTTT T?VAT C 
v r v x /i\a_jvj 1 1 v Jrrv w iviix. v i\j\.ooJDV^ivijvJUJLJ.r 1 1 JLr v /\.i_,o 

PQGTAQGELFLDDGHTFNYQTRQEFLLRRFSFSG 

NTLVSSSADPEGHFETPIWIERVVIIGAGKPAAVV 

LQTKGSPESRLSFQHDPETSVLVLRKPGINVASD

WSIHLR 


3849


A 


1 


1717 


RARNARGCWGVCRSGFSSAVCGAARMEQVAEG 

ARVTAVPVSAADSTEELAEVEEGVGWGEDNDA 

AARGAEAFGDSEEDGEDVFEVEiaLDMKTEGGK 

VLYKVRWKGYTSDDDTWEPE1HLEDCKEVLLEF

RKK1AENKAKAVRKDIQRLSLNNDIFEANSDSDQ 

QSETKEDTSPKKKXKKLRQREEKSPDDLKKKKA 

KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 

EELKESKKPKKDEVKETKELKKVKKGEIRDLKT 

KTREDPKENRKTKKEKFVESQVESESSVLNDSPF 

PEDDSEGLHSDSREEKQNTKSARERAGQDMGLE 

HGFEKPLDSAMSAEEDTDVRGRRKKKTPRKAED 

TRENRKLENKNAFLEKKTVPKKQRNQDRSKSAA 

ELEKLMPVSAQTPKGRRLSGEERGLWSTDSAEE 

DKETKR^SKKPKKDEVKETKELKKVKKGEIRD 

LKTKTREDPKENRKTKKEKFVESQVESESSVLND 

SPFPEDDSEGLHSDSREEKQNTKSARERAGQDM 

GLEHGFEKPLDSAMSAEEDTDVRGRRKKKTPRK 

AEDTRENRICLENKNAFLEKXTVPKKQRNQDRSK 

SAAELEKLMPVSAQTPKGRRLSGEERGLWSTDS 

AEEDKETKRNESKKPKKDEVKETKELKKVKKGE
IRDLKTKTREDPKENRKTKKEKFVESQVESESSV
LNDSPFPED/RQ*RATFRQQREEKSPDDLKKKKA 
KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 
EELKESKKPK 


3850 


A 


1113 


3975 


PAAAAAAAAAAAAAAGRGPSFTPCFSPSLAVEPS 

RRTRLGSDPAQAMAGNVKKSSGAGGGSGSGGS 

GSGGLIGLMKDAFQPHHHHHHHLSPHPPGTVDK 

KMVEKCWKLMDKVVRLCQNPKLALKNSPPYIL 

DLLPDTYQHLRTILSRYEGKMETLGENEYFRVF 

MENLMKKTKQTISLFKEGKERMYEENSQPRRNL 

TKLSLIFSHMLAELKGIFPSGLFQGDTFRITKADA 

AEFWRKAFGEKTIVPWKSFRQALHEVHPISSGLE 

AMALKSTIDLTCNDYISVFEFDIFTRLFQPWSSLL 

R^TVVNSLAVTHPGYMAFLTYDEVKARLQKFIHKP 

GSYIFRLSCTRLGQWAIGYVTADGNILQTIPHNKP 

LFQALEDGFREGFYLFPDGRNQNPDLTGLCEPTP 

QDHIKVTQEQYELYCEMGSTFQLCKICAENDKD 

VKIEPCGHLMCTSCLTSWQESEGQGCPFCRCEIK 

GTEPIWDPFDPRGSGSLLRQGAEGAPSPNYDDD

DDERADDTLFMMKELAGAKVERPPSPFSMAPQA 

SLPPVPPRLDLLPQRVCVPSSASALGTASKAASGS 

LHKDKPLPVPPTLRDLPPPPPPDRPYSVGAESRPQ 

RRPLPCTPGDCPSRDKLPPVPSSRLGDSWLPRPIP 

KVPVSAPSSSDPWTGRELTNRHSLPFSLPSQMEP 

RPDVPRLGSTFSLDTSMSMNSSPLVGPECDHPKI

KPSSSANAIYSLAARPLPVPKLPPGEQCEGEEDTE 

YMTPSSRPLRPLDTSQSSRACDCDQQIDSCTYEA 

MYMQSQAPSITESSTFGEGNLAAAHANTGPEES 

ENbDDGYDVPKPPVPAVLARRTLSDISNASSS/FG 

LFVLERDP*PQNVTEGSQVPERPPKPFPRRINSER 

KAGSCQQGSGPAASAATAVSPQLSSEIENLMSQG 

YSYQDIQKALVIAQNNIEMAKNILREFVSISSPAH 

VAT 


3851 


A


2 


2781 


GRVGSMDGAMGPRGLLLCMYLVSLLILQAMPA 

LGSATGRSKSSEKRQAVDTAVDGVFIRSLKVNC 

KVTSRFAHYVVTSQWNTANEAREVAFDLEIPK 

TAFISDFAVTADGNAFIGDIKDKVTAWKQYRKA 

AISGENAGLVRASGRTMEQFTIHLTVNPQSKVTF 

QLTYEEVLKRNHMQYEIVIKVKPKQLVHHFEIDV 

DEFEPQGISKLDAQASFLPKELAAQTIKKSFSGICK 

GHVLFRPTVSQQQSCPTCSTSLLNGHFKVTYDVS 

RDIGCDLLVANNHFAHFFAPQNLTNMNKNVVFV 

IDISGSMRGQKVKQTKEALLBOLGDMQPGDYFD 

LVLFGTRVQSWKGSLVQASEANLQAAQDFVRGF 

SLDEATNLNGGLLRGIEILNQVQESLPELSNHASI 

LIMLTDGDPTEGVTDRSQILKNVRNAIRGRFPLY 

NLGFGHNVDFNFLEVMSMENNGRAQRIYEDHD 

ATQQLQGFYSQVAKPLLVDVDLQYPQDAVLALT 

QNHHKQYYEGSEIVVAGR1ADNKQSSFKADVQA 

HGEGQEFSITCLVDEEEMKKLLRERGHMLENHV 

MSLDYGFVTPLTSMSIRGMADQDGLKPTIDKPSE 
DSPPLEMLGPRRTFVLSALQPSPTHSSSNTQRLPD 
RVTGVDTDPHFIIHVPQKEDTLCFNINEEPGVILS 
LVQDPNTGFSVNGQLIGNKARSPGQHDGTYFGR
LGIANPATDFQLEVTPQNIILNPGFGGPVFSWRD 

VjAVLKl^JLKj V V V 1 INJsJsJunLV Vb VUUuCj lr\bVV\ 

LHRVW\KGSS\VHQDFLGLLMCWDKSTGMSSPGR 
KGCWGQ\FFHPIRFLKVS*HPPPGSDPQKAQMPT 
MVVRNPPGLTVT\RGLQKDYSKDPWHGAEVSC 
WFI\HNNGA*I\TDCAYTDYI\VPDEF 


3852 


A 


39 


1735 


TQVAEAGRGEGVVAGAETGRPQSAGMNLELLES 

FGQNYPEEADGTLDCISMALTCTFNRWGTLLAV 

GCNDGRIVIW\DF^TRGIA*NKFSAHIHPVCSLC 

WSRDGHKLVSASTDNIVSQWDVLSGDCDQRFRF 

PSPILKVQYHPRDQNKVLVCPMKSAPVMLTLSD

SKHVVLPVDDDSDLNVVASFDRRGEYIYTGNAK 

GKILVLKTDSQDLVASFRVTTGTSNTTAIKSffiFA 

RKGSCFLINTADRIIRVYDGREILTCGRDGEPEPM 

QKLQDLVNRTPWKKCCFSGDGEYIVAGSARQH 

ALYIM^KSIGNLVICILHGTRGELLIJDVAWHPVRP 

IIASISSGWSIWAQNQVENWSAFAPDFKELDEN 

VEYEERESEFDIEDEDKSEPEQTGADAAEDEEVD 

V 1 bVDFlAAt C6SDJbhLED£>KALLYLrlAPE VEDP 

EENPYGPPPDAVQTSLMDEGASSEKKRQSSADG 

SQPPKKKPKTTNIELQGVPNDEVHPLLGVKGDG

KSKKKQAGRPKGSKGKEKDSPFKPKLYKGDRGL 

PLEGSAKGKVQAELSQPLTAGGAISELL 


3853 


A 


45 


2603 


PLLFTCGREVRARDPEKEGT1VVAGLKVQVQPRF 

LWILCFSMEETQGELTSSCGSKTMANVSLAFRDV 

SIDLSQEEWECLDAVQRDLYKDVMLENYSNLVS

LDLEYKYITKNLLSEK^fVCKJYLSQLQTGEKSKN 

TIHEDTIFRNGLQCKHEFERQERHQMGCVSQMLI 

QKQISHPLHPKIHAREKSYECKECRKAFRQQSYLI 

QHLRIHTGERPYKCMECGKAFCRVGDLRVHHTI 

HAGERPYECKECGKAFRLHYHLTEHQRIHSGVK 

PYECKECGICAFSRVRDLRVHQT1HAGERPYECK 

ECGKAFRLHYQLTEHQRIHTGERPYECKVCGKT 

FRVQRHISQHQKIHTGVKPYKCNECGKAFSHGS 

YLVQHQKIHTGEKPYECKECGKSFSFHAELARH 

RRIHTGEKPYECRECGKAFRLQTELTRHHRTHTG 

EKPYECKECGKAFICGYQLTLHLRTHTGEIPYEC 

KECGKTFSSRYHLTQHYRIHTGEKPYICNECGKA 

FRLQGELTRHHRIHTCEKPYECKECGKAFIHSNQ 

FISHQRIHTSESTYICKECGKIFSRRYNLTQHFK1H 

TGEKPYICNECGKAFRFQTELTQHHRIHTGEKPY 

KCTECGKAFIRSTHLTQHHRJHTGEKPYECTECG 

KTFSRHYHLTQHHRGHTGEKPYICNECGNAFICS 

YRLTLHQRIHTGELPYECKECGKTFSRRYHLTQH 

r KLri 1 CrblsJP Y bCKJEOGN ArRLQAELTRHHIVHTG 

EKPYKCKECGKAFSVNSELTRHHRIHTGEBCPYQC 

KECGKAFIRSDQLTLHQ\KIILVR\NPMHNVKRIR 

WPLENAL*QRICNLRNFLFVTEHVG1PFTSCSQFI 

RNYFVC 


3854 


A 


108 


894 


LQSCWVPGIPWPSVGWLSWLKDLPSCEIHSASLS 

r\ V JU^wr^^O-DlVLL, WJtjSJNJj 1 o W UUooo V OOVJloL/ 1 1 

DNLSTDD1NTSSSISSYANTPASSRKNLDVQTDAE 
KHSQVERNSLWSGDDVKKSDGGSDSGIKMEPGS 
KWRRNPSDVSDESDKSTSGKKNPVISQTGSWRR 
GMTAQVGITMPRTKASAPAGALKTPGTGKRPGL
S\GPGAPTPAAPPQLARMAWAFSLSAASTPAVSP 
STSPSAVEGSPATILPLASSPPPRTTP*LPLSELTV* 
RPQELVRGRGCLGPGAPTPAAPPQLARMAWAFS 
LSAASTPAVSPSTSPSAVEGSPATILPLASSPPPRT 
TP 


3855 


A 


1 


772 


FRGGDGAPGVLKPGNPLPFPLPPLQYPPPSTLSHS 

DNLAMTSRSTARPNGQPQASKICQFKLVLLGESA 

VGKSSLVLRFVKGQFHEYQESTIGAAFLTQSVCL 

DDTTVKFEIWDTAGQERYHSLAPMYYRGAQAAI 

VVYDITNQE1TARAKTWVKELQRQASP\SIVVGL 

AGNKADLANKRMVEYEEAQAYADDNSLLFMET 

SAKTAMNVNDLFL\AIA*EVAKRVNPQNLG\G\A 

AGRSRGVDLHEQS\QQNKSQCCSN 


3856 


A 


2815 


352 


LGLEAAARPRPGGPAAMQDGNFLLSALQPEAGV 

CSLALPSDLQLDRRGAEGPEAERLRAARVQEQV 

RARLLQLGQQPRHNGAAEPEPEAETARGTSRGQ 

YHTLQAGFSSRSQGLSGDKTSGFRPIAKPAYSPA 

SWSSRSAVDLSCSRRLSSAHNGGSAFGAAGYGG 

AQPTPPMPTRPVSFHERGGVGSRADYDTLSLRSL 

RLGPGGLDDRYSLVSEQLEPAATSTYRAFAYER 

QASSSSSRAGGLDWPEATEVSPSRTIRAPAVRTL 

QRFQSSHRSRGVGGAVPGAVLEPVARAPSVRSLS 

LSLADSGHLPDVHGFNSYGSHRTLQRLSSGFDDI 

DLPSAVKYLMASDPNLQVLGAAYIQHKCYSDAA 

AKKQARSLQAVPRLVKLFNHANQEVQRHATGA 

MRNLIYDNADNKLALVEENGIFELLRTLREQDDE 

LRKNVTGILWNLSSSDHLKDRLAKKTPLE\QLT\D 

LGV*APLSGAGGPP\LIQQNASEAEIFYNATGFPR 

NLSSASQATRQKMRECHGLVDALVTSINHALDA 

GKCEDKSVENAVCVLRNLSYRLYDEMPPSALQR 

LEGRGRRDLAGAPPGEVVGCFTPQSRRLRELPLA 

ADALTFAEVSKDPKGLEWLWSPQIVGLYNRLLQ 

RCELNRHTTEAAAGALQNITGG\DPRGPGGLSRL 

ALEQERILNPLLDRVRTADHHQLRSLTGLIRNLS 

RNARNKDEMSTKVV\SHLI\EKLPGSVGEKSPPAE 

VLV\NI\IAVFNNLGWLASPI/ALARDLLYFDGLRK 

LIFIKKKRDSPDSEKSSRAASSLLANLWQYNKLH 

RDFRAKGYRXEDFLGP 


3857 


A 


1034 


204 


VAVTLLSQLPSAIQRTAAWEMRAPLTFRVPLALD 

LIKPEHCTVNVDNSLSIPVIAAELVVRKPSEKGM 

QQKKKTKDLGFRAGKESKTEWRK*GLQDMASQ 

MFALPLK*PVTAAFHDSSMPSSLLQIEMEQLFLE 

ARLQ/PDSKSEARRNQCDSMLLRNQQLCSTCQE 

MKMVQPRTMKEPDDPKASFENCMSYRMSLHQP 

KFQTTPEPFHDDIPTENIHLQNL/PILGPRTAVFHG 

LLTEAYKTLKERQRSSLPRKEPIGKTTEAVSGRSS 

SPPRLPERK 


3858 


A 


203 


3469 


SHQEIEQNSAMAPRKRGGRGISFIFCCFRNNDHPE 
ITYRLRNDSNFALQTMEPALPMPPVEELDVMFSE 
LVDELDLTDKHREAMFALPAEKKWQIYCSKKK 

EEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLS 
CILNFLKTMDYETSESRfflTSLIGCIKALMIWSQG 
RAIWLAHSESINV1AQSLSTENIKTKVAVLEILGA 
VCLWGGHKKVLQAMLHYQKYASERTRFQTLIN
DLDKSTGRYRDEVSLKTAIMSFINAVLSQGAGVE

SLDFRLHLRYE\FLMLGIHPVMDBCLRKHENSTLD 

RrlLDEFENn^RKEDELEFAKRFELVHIDTKSATQM 

FELTRXRLTHSEAYPHFMSILHHCLQMPYKRSGN 

TVQYWLLLDR1IQQIVIQNDKGQDPDSTPLENFNT 

KNVVRMLVNENEVKQW<JEQAEKMRKEHNELQ 

QKLEKKERECDAKTQEKEEMMQTLNKMKEKLE 

KETTEHKQVKQQVADLTAQLHELSRRAVCASIP 

GGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGMLPP 

PPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALK 

KKSIPQPTNALKSFNWSKLPENKLEGTVWTEIDD 

TKVFKILDLEDLERTFSAYQRQQDFFVNSNSKQK 

EADAIDDTLSSKLKVKELSVIDGRRAQNCNILLS 

RLKLSNDEIKRAILTMDEQEDLPKDMLEQLLKFV 

PEKSDIDLLEEHKHELDRMAKADRFLFEMSRINH 

YQQRLQSLYFKKKFAERVAEVKPKVEAIRSGSEE 

VFRSGALKQLLEVVLAFGNYMNKGQRGNAYGF 

KISSLNKIADTKSSIDKNITLLHYLITIVENKYPSV 

LNLNEELRDIPQAAKWMTELDKEISTLRSGLKA 

VETELEYQKSQPPQPGDKFVSVVSQFITVASFSFS 

DVEDLLAEAKDLFTKAVKHFGEEAGKIQPDEFF 

GIFDQFLQAVSEAKQENENMRKKKEEEERRARM 

EAQLKEQRERERKMRKAKENSEESGEFDDLVSA 

LRSGEVFDKDLSKLKRNRKRITNQMTDSSRERPI 

TKLNF 


3859 


A 


1279 


141 


RVEHLSEFLVDIKPSLTFDVIPLLDPYGPAGSDPS
LEFLVVSEETYRGGMAINRFRLENDLEELALYQI 
QLLKDLRHTENEEDKVSSSSFRQRMLGNLLRPPY 
ERPELPTCLYVIGLTGISGSGKSS1AQRLKGLGAF 
VBDSDHLGHRAYAPGGPAYQPVVEAFGTDILHK 
DGIINRXVLGSRWGNKXQLKILTDMWPIL^KLA 
REEMDRAVAEGKRVCyiDAAVLLEAGWQNLVH
EVWTAVIPETEAVRRiVERDGLSEAAAQSRLQSQ 
MSGQQLVEQSHVVLS'RCGSRISPNARWRKPGPS 
CRSAFPRLIRPSTEKFSVGPDWLLELTSDPWRRN 
GGLDAHPGSGPEVQAILCRTWPGLVDTGSLPNTL 
VFGQH 


3860 


A


1


3881 


MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEELVANDKACGLLGYSSQDLTGQKLTQFFLR 

SDSDWEALSEEHMEADGHAAVVFGTVVDIISRS 

GEKIPVSVWMKRMRQERRLCCVVVLEPVERVST

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKN1TFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINVVLA 

GGHVVPRDEIRKLMESQDIFTGTQTELIAGGQLL 

^PT ^pnpAPnvn>A/P'pn9T pv wren at PTf'nnrn'T 

ALGREEPVAffiSPGQDLLGESRSEPVDVKPFASCE 
DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 
GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 
CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD
EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

\0>STLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW

TAVDKEKNKEVWKFIKKEKVLEDCWIEDPKLG 

KVTLEIA1LSRVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAG\Q 

SRLVSAVGYLRLKDIIH31D1KDENIV1AEDFTIKLI 

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY

RftPFT FTVyfW^T OVTT VTT VFFFXTPPr'FT PPTVPA A 
A.VJrJ&JUJQJVl W »M-/Vj V 1LI iLt V r IZClSrr^nLelZCr 1 VZlA/i 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 
DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 
LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 
HPGDPRLLTS 


3861 


A 


1 


3881 



MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA

KTTEILVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDWEALSEEHMEADGHAAWFGTWDIISRS 

GEKIPVSVWMKRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKNITFLPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINVVLA 

GGHWPRDEIRKLMESQDIFTGTQTELIAGGQLL

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAffiSPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 

EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

WST1JDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TAVDKEKNKEVWKFIKJK£KVLEDCWIEDPKLG 

KVTLEIA1LSRVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAGVQ 

SRLVSAVGYLRLKDIIHRDIKDENrVIAEDFTIKLI 

DFGSAAYLERGKLFYTFCGTEEYCAPEVLMGNPY 
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SEQED 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










Pf"lDT?T "CTv/TYl/CT n\/TT VTT \7T?t?t7XrDCr , 'CT CrT\rc A A 

KurJtlL-JfcMWol^OV IJu I ll^VrfcllJNrrLJlLcbl VLAA 

IHPPYLVSKELMSLVSGLLQPWERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 


3862 


A 


399 


2069 


TMDRSKRNSIAGFPPRVE\RLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDLAVGLSYLHFKGDFHRDLTSKNC 

LIKRDENGYSAVVADFGLAEKIPDVSMGSEKLA

VVGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDGAARTPKVNPFSARQDLMGGIOKFFDLPSK 

o VlbL vr DLDArur Cj 1 MPLAD WQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDGSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3863 


A 


399 


2069 


TMDRSKRNSIAGFPPRVEVRLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAWADFGLAEKIPDVSMGSEKLA 

VVGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDGAARTPKVNPFSARQDLMGGKIKFFDLPSK 

o VloL VrULUArOru lMPLAD WQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3864 


A 


3


911 


SWNMDSDSCAAAFHPEEYSPSCKRRRTVEDFNK 

FCTFVLAYAGYIPYPKEELPLRSSPSPANSTAGTI 

DSDGWDAGFSDIASSVPLPVSDRCFSHLQPTLLQ 

RAKPSNFLLDRKKTDKLKKJ<K^ 

Jdu i KUUJLri^JsJLJbAAJJr Y V 1 1 r 1 or I LQDlr Q Ar i>D 

PCSGWDSDTPSSGSCATVSPDQVKEIKTEGKRTI 

VR/QEAQLMARNDGNFSSLLESIFPS\DDDSWDLV 

TCFCMKPFAGRPMIECNECHTWIHLSCAKIRKSN 

VPEVFVCQKCRDSKFDIRRSNRSRTGSRICLFLD 


3865


A 


3 


3573 


QERLRSRSRPDRAAREAGSARGRQPKRTERVEQ 
FLTIARRRGRRSMPVSLEDSGEPTSCPATDAETAS 

DDHDDTSDSDSDGLTLKELQNRLRRKREQEPTE 
RPLKGIQSRLRKKRREEGPAETVGSEASDTVEGV 
LPSKQEPENDQGWSQAGKDDRESKLEGKAAQD 
IKDEEPGDLGRPKPECEGYDPNALYCICRQPHNN 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










RFMICCDRCEEWFHGDCVGISEARGRLLERNGE 
DY1CPNCTDLQVQDETHSETADQQEAKWRPGDA 
DGTDCTSIGTTEQKSSEDQGIKGRIEKAANPSGKK * 
KLKIFQPGPGPVPTQLPVLWQVLEIAVSRSISAFT 
LLHCISCKVIEAPGASKCIGPGCCHVAQPDSVYCS 
NDCILKHAAATMKFLSSGKEQKPKPKEKMKMK
PEKPSLPKCGAQAGn<QSSVHKRPAPEKKETTVK 
KAWVPARSEALGKEAACESSTPSWASDHNYNA 
VKPEKTAAPSPSLLYKSTKEDRRSEEKAAATAAS 
KKTAPPGSTVGKQPAPR2sTLVPKKSSFANVAAAT 
PAKKPPSGFKGTIPKRPWLSATPSSGASAARQAG 
PAPAAATAASKKFPGSAALVGAVRKPVVPSVPM
ASPAPGRLGAMSAAPSQPNSQIRQNIRRSLKEIL 
WK/RFLFFILFRVNDSDDLIMTENEVGKIALHIEK 
EMFNLFQVTDN/RAYKSKYRSIMFNLKDPKNQG 
LFHRVLREEISLAKLVRLKPEELVSKELSTWKER 
PARSVMESRTKLHNESKKTAPRQEAIPDLEDSPP 
VSDSEEQQESARAVPEKSTAPLLDVFSSMLKDTT 
SQHRAHLFDLNCKICTGQVPSAEDEPAPKKQKLS 
ASVKKEDLKSKHDSSAPDPAPDSADEVMPEAVP 
EVAiSEPGLESASHPNVDRTYFPGPPGDGHPEPSPL 
EDLSPCPASCGSGVVTTVTVSGRDPRTAPSSSCT 
AVASAASRPDSTHMVEARQDVPKPVLTSVMVPK 
SILAKPSSSPDPRYLSVPPSPNISTSESRSPPEGDTT 
LFLSRLSTIWKGFINMQSVAKFVTKAYPVSGCFD 
YLSEDLPDTIHIGGRIAPKTVWDYVGKLKSSVSK
ELCL1RFHPATEEEEVAYISLYSYFSSRGRFG VVA 
hMNniHVKDLYLIPLSAQDPVPSKLLPFEGPGKRR 
LSGWR 


3866 


A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ

FNKTVEHGFPHQPSALGYSPSLRJLAIGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQMLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQiTVVLPHSSCELLYLGTESGNVFVVQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 

EALQEHPRDPNQILIGYSRGLVVIWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP

VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ 

G\LPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALWLAEEEL 

WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

IPLICLWERI1AAGSRQNAHFSTMEWPIDGGTSLTP 

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAVVTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

CD V"D "LTD A ^ITDT^/IT? A /^iT7/~> CAVA XTTt Tk/^T /~VKTK JTCf A T>T 7 

61<J<a<J±r7lur r<jr Jb A QJbG a A1CAEKPGLQNMEL AP V 

QRKJEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAKE1QLMHRAPVVGELVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLVVSEEQFKVFTLPKVSAK 
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SEQ ID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










LKLKLTALEGSRVRRVSVAHFGSRRAEDYGEHH 

LAVLTNLGDIQWSLPLLKPQVRYSCIRREDVSGI 

ASCVFTKYGQGFYLISPSEFERFSLSTKGVLVEPRC 

LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD

GEEKQPGLVMERALLSDERAATGWHIEPPWGA 

ASAMAEQSEWLSVQAAR 


3867 


A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 
QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 
FNKTVEHGFPHQPSALGYSPSLRILAJGTRSGAIK 
LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 
LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 
GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 
LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 
EALQEHPRDPNQILIGYSRGLVVIWDLQGSRVLY
HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 
VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ 
G\LPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD 
FTSRVIGFTVLTEADPAATFDDPYALVVLAEEEL 
WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 
IPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP 
APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 
STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF
DPYSDDPRLGIQKEFLCKYSGYLAVAGTAGQVLV 
LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 
RLAARSGPVRFEPGFQPFVLVQCQPPAWTSLAL 
HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 
PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 
SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 
QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 
PSLWAGTOGGTIYAFSLRVPPAERRMDEPVRAE 
QAKEIQLMHRAPWGILVLDGHSVPLPEPLEVAH 
DLSKSPDMQGSHQLLVVSEEQFKVFTLPKVSAK 
LKLKLTALEGSRVRRVSVAHFGSRRAEDYGEHH
LAVLTNLGDIQVVSLPLLKPQVRYSCIKREDVSGI 
ASCVFTKYGQGFYLISPSEFERFSLSTKG\LVEPRC 
LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 
GEEKQPGLVMERALLSDERAATGWHIEPPWGA 
ASAMAEQSEWLSVQAAR 


3868 


A 


1 


2497 


GDSGGPLVCEEPSGRFFLAGIVSWGIGCAEARRP 

GVYARVTRLRDW1LEATTKASMPLAPTMAPAPA . 

APSTAWPTSPESPVVSTPTKSMQALSTVPLDWVT 

VPKLQECGARPAMEKPTRWGGFGAASGEVPW 

QVSLKEGSRHFCGATWGDRWLLSAAHCFNHT 

KVEQVRAHLGTASLLGLGGSPVKIGLRRWLHP 

LYNPGBLDFDLAVLELASPLAFNKYIQPVCLPLAI 

QKFPVGRKCMISGWGNTQEGNATKPELLQKASV 

GIIDQKTCSVLYNFSLTDRMICAGFLEGKVDSCQ 

VSGIKALYESELADARRVLDETARERARLQIEIG 

KLRAELDEVNKSAKKREGELTVAQGRVKDLESL 

FHRSEVELAAALSDKRGLESDVAELRAQLAKAE 

DGHAVAKKQLEKETLMRVDLENRCQSLQEELDF 

RKSVFEEEVRETRRRHERRLVEVDSSRQQEYDFK 

MAQALEELRSQHDEQVRLYKLELEQTYQAICLDS 

AKLSSDQNDKAASAAREELKEARMRLESLSYQL 

SGLQKQASAAEDRIRELEEAMAGERDKFRKMLD 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










AKEQEMTE^4RDVMQQQLAEYQELLDVKLALD 

MEINAYRKLLEGEEERLKLSPSPSSRVTVSRATSS 

SSGSLSATGRLGRSKRKR\WRWRSPW\QRPKRPG 

HGHGWQRWLPPGPAGLGLGQRVHIEEIDLEGKFV 

V^LKNN bDKJJQbLCjN WKXI^QVLEvjEEIAYKFTP 

KYE.RAGQMVTVWAAGAGVAHSPPSTLVWKGQ 

SSWGTGESFRTVLVNADGEEVAMRTVKKSSVM 

RENENGEEEEEEAEFGEEDLFHQQGDPRTTSRGC 

YVM 


3869 


A 


! 


1942 


RYRAGPGDGRKDYIRLTRPGLTLPGRAMFARGS 

RRRRSGRAPPEAEDPDRGQPCNSCREQCPGFLLH 

GWRKICQHCKCPREEHAVHAVPVDLER1MCRLIS 

DFQRHSISDDDSGCASEEYAWVPPGLKPEQVYQ 

FFSCLPEDKWYVNSPGEKYRIKQLLHQLPPHDS 

EAQYCTAL\EE\EEKKELRAFSQQRKRENLG/RLG 

IVRIFPVTI'RGAI\CEECGKQIGGGDIAVF\ASRASL 

GLLLGQPSCFWCTTCQELLVDLIYFYHVGKVYC 

GRHHAECLRPRCQACDEIIFSPECTEAEGRHWHM 

DHFCCFECEASLGGQRYVMRQSRPHCCACYEAR 

HAEYCDGCGEHIGLDQGQMAYEGQHWHASDRC 

FCCSRCGRALLGRPFLPRRGLIFCSRACSLGSEPT 

APGPSRRSWSAGPVTAPLAASTASFSAVKGASET 

TTKGTSTELAPATGPEEPSRFLRGAPHRHSMPEL 

GLRSVPEPPPESPGQPNLRPDDSAFGRQSTPRVSF 

KDPL V bbuurKK TLS APP AQRRRPRSPPPRAPSRR 

RHHHHNHHHHHNRHPSRRRHYQCDAGSGSDSE 

SCSSSPSSSSSESSEDDGFFLGERIPLPPHLCRPMP 

AQDTAMETFNSPSLSLPRDSRAGMPRQARDKNC 

IVA


3870 


A 


2 


3485 


FVWRVFYVHASCMPPRARSWEGAHAPVGMHV 

AEAHACSSQQQQMPPAQFWMLEWLLHLCAFLS 

TPSFPHWCCCSNPHGSIADKPEEIVPASKPSRAAE 

NMAVEPRVATIKQRPSSRCFPAGSDMNSVYERQ 

GIAVMTPTVPGSPKAPFLGIPRGTMRRQKSIDSRI 

FLSGITEEERQFLAPPMLKFTRSLSMPDTSEDIPPP 

PQSVPPSPPPPSPTTYNCPKSPTPRVYGTDCPAFNQ 

NSAAKVSPATRSDTVATMMREKGMYFRRELDR 

YSLDSEDLYSRNAGPQANFRNKRGQMPENPYSE 

VGKIASKAVYVPAKPARRKGMLVKQSNVEDSPE

KTCSIPIPTIIVKEPSTSSSGKSSQGSSMEIDPQAPE 

PPSQLRPDESLTVSSPFAAAIAGAVRDREKRLEA

RRNSPAFLSADLGDEHVGLGPPAPRTRPSMFPEE 

GDFADEDSAEQLSSPMPSATPREPENHFVGGAEA 

SAPGEAGRPLNSTSKAQGPESSPAVPSASSGTAG 

PGNYVHPLTGRLLDPSSPLALALSARDRAMKES 

QQGPKGEAPKADLNKPLYIDTKMRPSLDAGFPT 

VTRQNTRGPLRRQETENKYETDLGRDRKGDDK 

KNMLIDIMDTSQQKSAGLLMVHTVDATKLDNA 

LQEEDEKAEVEMKPDSSPSEVPEGVSETEGALQI 

SAAPEPTTVPGRTTVAVGSMEEAVILPFREPPPPLA

SVDT DEDFTFTFPT PPPT FFAKKFTYTPnTYR A A QVPA 

LSDLVKQKKSDTPQSPSLNSSQPTT^SADSICKPAS 
LSNCLPASFLPPPESFDAVADSGIEEVDSRSSSDH 
HLETTSnSTVSSISTLSSEGGENVDTCTVYADGQ 
AFMVDKPPVPPKPKA1KPIIHKSNALYQDALVEE 
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SEQJD 
NO: 


Method 


Predicted 
beginning 
nucleotide
location
corresponding
to first amino
acid residue of
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










DVDSFVIPPPAPPPPPGSAQPGMAKVLQPRTSKL 

WGDVTEIKSPILSGPKANVISELNSILQQMNREKL 

AKPGEGLDSPMGAKSASLAPRSPEIMSTISGTRST 

TVTFTVRPGTSQPITLQSRPPDYESRTSGTRRAPS 
PWSPTFMNfcTFTT PAPT ^AATA<3PQPAT <3nVT?CT P 

SQPPSGDLFGLNPAGRSRSPSPSILQQPISNKPFTT 
KPVHLWmPDVADWLESLNLGEHKEAFMDNEI 
DGSHLPNLQKEDLEDLGVTRVGHRMNIERALKQ 
LLDR 


3871 


A 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHIEDGMGRNLADRCTD 

EVNALVLQTQQEI1ENLKPLLPAGIQDKLHTL1PC 

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

TPATPDNASQEELMTTLVTGLASVTSRTSMGIIIV 

GGVTWKTTfiWKT T ^V^T TTWTVGAT VT VRPT QWTT 

HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 
VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID 
QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS 
NEES 


3872 


A 




1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKJYKSELNKHIEDGMGRNLADRCTD 

EVNALVLQTQQEIIENLKPLLPAGIQDKLHTLIPC 

KICTDLSYNLNYHKLCSDFQEDfVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

TPATPDNASQEELMITLVTGLASVTSRTSMGIIIV 

GGVTWKTTGWK'T T <sV^T TMVG AT VT VPP T CWTT 

HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 
VKQQIATTFARLCQQVDITQKQLEEEIARLPKEED 
QLEKIQNNSKLLRNICAVQLENELENFTKQFLPSS
NEES 


3873 


A 


2944 


2089 


PVCTALTPGRMTDDKDVLRDVWFGR1PTCFTLY 
QDEITEREAEPYYLLLPRVSYLTLVTDKVKI<CHFQ 
KVMRQEDISEIWFEYEGTPLKWHYPIGLLFDLLA 
SSSALPWNITVHFKSFPEKDLLHCPSICDAIEAHF 
MSCMKFADAT KTTK ^OVT>JFMnT<f TTnT-Tk'rn w/Tur; 

LQNDRFDQFWAINRKLMEYPAEENGFRY1PFRIY 
QTTTERPFIQKLFRPVAADGQLHTLGDLLKEVCP 
SAmPEDGEKKNQVMIHGIEPMLETPLQWLSEHL 
SYPDNFLHISIIPQPTD


3874 


A 


776


366 


OARGAP^s^PMPPT PT A A A AVA APP APT PT T TJPr; 

LAAAMSTAQSLKSVDYEVFGRVQGVCFRMYTE 
DEARKIGWGWVKNTSKGTVTGQVQGPEDKVN 
SMKSWCSKVGSPSSRIDRTNFSNEKTISKLEYSNF 
SIRY 


3875 


A 


1081 


182 


SLSSCQTDPRPMSAPLDAALHALQEEQARLKMR 

LWDLQQLRKELGDSPKDKVPFSVPKIPLVFRGHT 

QQDPEVPKSLVSNLRIHCPLLAGSALITFDDPKVA 

EQVLQQKEHUNMEECRLRVQVQPLELPMVTTIQ 

VMVSSQLSGRRVLVTGFPASLRLSEEELLDKLE1F 

FGKTRNGGGDVDVRELLPGSVMLGFARDGVAQ 

RLCQIGQFTVPLGGQQVPLRVSPYVNGEIQKAEI 

RSQPVPRSVLVLNIPDILDGPELHDVLEIHFQKPT 
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SEQID 
NO: 


Method 



Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










RGGG EVEALTV VPOGOOGLAVFT^F 


3876 


A 


26 


431 


RMMKCPQALLAIFWLLLSWVSSEDKWQSPLSL 
VVHEGDTVTLNCSYEVTNFRSLLWYKQEKKAPT 
FLFMLT^SGIEKKSGRLSSILDKKELSSILNITATQ 
TGDSAIYLCAVEAQCSLVTCSLYSNSTAEALQL 


3877 



A 


3 


1291 


KAFRLLAERGAAAAMLWSGCRRFGARLGCLPG 

GLRVLVQTGHRSLTSCIDPSMGLNEEQKEFQKV 

AFDFAAREMAPNMAEWDQKELFPVDVMRKAA 

QLGFGGVYIQTDVGGSGLSRLDTSVIFEALATGC 

TSTTAYISIHNMCAWMID^^ 

TMEKFASYCLTEPGSGSDAASLLTSAKKQGDHY1 

LNGSKAFISGAGESDIYVVMCRTGGPGPKGISCIV 

VEKGTPGLSFGKKEKKVGWNSQPTRAVIFEDCA 

AHASVILTRDHLNVRKQFGEPLASNQYLQFTLA
DMATRLVAARLMVRNAAVALQEERKDAVALCS 
MAKLFATDECFAICNQALQMHGGYGYLKDYAV 
QQYVRDSRVHQILEGSNEVMRJLISRSLLQE 


3878 


A 


10 


1014 


LPGSTISSSGCQAPGRADSSGGARNSRRGDSRPG 
SCNRQAVAPPCPSPGPQSRHWIHRGTAPQAGETR 
TLGRGSSAPNACSASVTPCCPSSPPS*SCL*PTRRS
PQNSSSTEVYRGFWQHGLPST**PFSS*QWPGQH 
TQGCSKLLGKQTTHLPCSTWPA**PSPSCLTRFR* 

W*PQT KAPT \X/AQQPCVrV*QDQr:CPTJU*T.'omTUOT 
w roi-rJYlL^J-. W/\joLo V V oi^ouoUKxl -L WO X xio 1 

SRTC*ARRSSALPTGLCTDDTSWASSSKARPCAL 
QRPSSLSSLSPCLTC*W*LSSSSPMSARSPAGAET 
GSWATGSPRLTQWKSSRLTSTSHSARSAWKPSA 
TESTPSWPRFSSWTSGEDPASPAPAI 


3879 


A 


200 


fiQQ 


NTSLCTRDYKITQVLFPLLYTVLFFVGLITNGLA 

DAKLGTGPLRTFVCQVTSVIFYFTMYIS1SFLGLIT 
IDRYQKTTRPFKTSNP3CNLLGAKILK 


3880 


A 


26 


169


QPETDTMVHLTPEEKSAVTALWGKVNVDEDAG 
DDLCQILVDRPRLRI 


3881 


A 


37 


1100 


TPLFDFWPGFVLSWLQPLSASLRARRAASGPPAC 
RJMPTTVDDVLEHGGEFHFFQKQMFFLLALLSAT 
FAPIYVGIWLGFTPDHRCRSPGVAELSLRCGWSP 
AEELNYTVPGPGPAGEASPRQCRRYEVDWNQST 
FDCVDPLASLDTNRSRLPLGPCRDGWVYETPGSS 
IVTEFNLVCANSWMLDLFQSSVNVGFFIGSMSIG 

YTAnP.FnP'K'T PT T '1 TUT TWA A AfiVT UATQPTVTW 

MLIFRLIQGLVSKAGWLIGYILITEFVGRRYRRTV 
GIFYQVAYTVGLLVLAGVAYALPHWRWLQFTV 
ALPNFFFLLYYWCIPESPRWLISQNKNAEAMRIIK 
fflAKKNGKSLPASL 


3882 


A 


573 


1620 


KSKCRFPEGLSEGFGPMRKEALSSGSVQEAEAM 

LDEPQEQAEGSLTVYVISEHSSLLPQDMMSY1GP 

KRTAVVRGIMHREAFOT1GRRIVQVAQAMSLIED 

VLAAALADHLPEDKWSAEKRRPLKSSLGYEITFS 

LLNPDPKSHDVYWDIEGAVRRYVQPFLNALGAA 

GNFSVDSQILYYAMLGVNPRFDSASSSYYLDMH 

SLPHVINPVESRLGSSAASLYPVLNFLLYVPELAH 

SPLYIQDKDGAPVATNAFHSPRWGGIMVYNVDS 

KTYNASVLPVRVEVDMVRVMEVFLAQLRLLFGI 
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PCTAJS01/04098 



SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










AQPQLPPKCLLSGPTSEGLMTWELDRLLWARSV 
ENLATATTTLTSLA 


3883 


A 


2369 


844 


RIHRJEEDFQFELKGIARLLSNPLLQTYLPNSTKKIQ 

FHQELLVLFWKLCDFNKVGQPRGALQGDGEQLP 

Q*PGGRDSVRLRGVGQSCPSLELSPLGPSPHP*KF 

LFFVLKSSDVLDBLVPILFFLNDARADQSRVGLM 

fflGVFILLLLSGECNFGVRLNKPYSIRVPMDPVF 

TGTHADLLIVXVFHKIITSGHQRLQPLFDCLLTIVV 

NVSPYLKSLSMVTANKLLPILLEAFSTTWFLFSAA 

QNHHLVFFLLEVn^IIQYQroGNSNLVYAIIRKR 

SEFHQLANLPTDPPTIHKALQRRRRTPEPLSRTGS 

QGGAPPWRAPAPLPLQSQAPSRPVWWLLQALTS 

*PRSPRCQRMAPCGPWNLSPSRAWRMAARLRGS 

PARHGGSSGDRP/HSSASGQWSPTPEWVLSWKS 

KLPLQTIMRLLQVLVPQVEKICIDKGLTDESEILR 

FLQHGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMWGVIYLRNVDPPVWYDTDVKLFEIQRV 


3884 


A 


1 


804 


NGPRAPFSQEGQSTGPPPLIPRLGQHGAQGRIPPL 
NPGQGPGPNKDDSRGPPNHHMGPMSERRHEQSG 
GPEHGPERGPLRGGQDCRGPPDRRGPHPDFPDDF 
SRPDDFHPDKRFGHRLREFEGRGGPLPQEEKWR 
RGGPGPPFPPDHREFSEGDGRGAARGPPGAWEG 
RRPGG*TFPPGSRGPTFS/SGAEEESFRRGAPPRHE 
GRAPPRGRDGFPGPEDFGPEENFDASEEAARGRD
LRGRGRGTPRGERVTKDTWSGRIGCRJHWL 


3885 


A 


3


996 


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 

SRGQRRKMAQEOTKMHNSEISKRLGAEWKLLSE 

TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGS\MG 

SVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 




A 


773 


317 


QCTQKAAEGYTQFYYVDVLDGKLACVNKCTKG 
mSQMNCNLGTCQLQRSGPRCLCPNTNTHWYW 
GETCEFNIAKSLVYGrVGAVMAVLLLALIILnLFS 
LSQ\RKRHRPESEGEADFGLENATNNFG\PTLETV 
DSGTELHIQ\RPEMVASTV 


3887 


A 


3 


466 


VDFRVKTLLVDNKCFVLQLWDTAGQERYHSMT 
RQLLRKADGVVLMYDITSQESFAHVRYWLDCL
QDAGSDGVVILLLGNKMDCEEERQVSVEAGQQL 
AQELGVYFGECSAALGHMLEPVVNLARSLRMQ 
EEGLKDSLVKVAPKRPPKRFGCCS 


3888 


A


3412 


3144 


QMDITOFSSSWNDGLAFCALLHTYLPAHIPYQEL 

NSQDKRROTMLAFQAAESVGIKSTLDINEMVRT 

ERPDWQNVMLYVTAIYKYFET 


3889 


A 


1 


1160 


LVVTAITAILAFPNEYTRMSTSELISELFNDCGLL 

UbbKLCUYbJSIKJrN i oKuOJbLPDKPAGVGVYSAM 

WQLALTLILKIVIT]FI7GMKIPSGLFIPS1^VGAI 

AGRLLGVGMEQLAYYHQEWTVFNSWCSQGAD 

CITPGLYAMVGAAACLGGVTRMTVSLVVIMFEL 

TGGLEYIWLMAAAMTSKWVADALGREGIYDA 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location

corresponding

to first amino

acid residue of

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










HIRLNGYPFLEAK£EFAHKTLAMDVMKPRRNDP 

QRLVGFVLRRDLnSIENARKKQDGVVSTSIIYFTE 
HSPPLPPYTPPTLKLRMLDLSPFTVTDLTPMEIVV 
DIFRKLGLRQCLVTHNGRLLGnTKJroVLKHIAQ 
MANODPDSTLFN 


3890 


A 


1 


387 


SWCWTGIFVLGTTNLRLEGSWYRSLWGPGFNTT 
TATLGFGAPQAPVGDVALNQPDMCVYRRGRKK 
RVPYTKLQLKELENEYAINKJFINKDKJUIRISAAT 
NLSERQVTIWFQNRRVKDKKIVSKLKDTVS 


3891 


A 


2 


2914 


RGGGGDHKMADLSLLQEDLQEDADGFGVDDYS 

SESDVIIIPSALDLAST/QDEMVERPLGRL\DK\YA 

ASENHI*PDKMVAPEFASIPLRE\VCDDERDCIAV 

LGKN*PDWADDSEPT\VRAAELEQVPHIALFLFK 

KTRLSITICFFSKFLLPYCGLDTLADQN\NQVRKT 

SQAALL\ALLEQELIERFDVETKVCPVLIELTAPDS 

NDDVKTEAVAIMCKMAP\MVGKDITERLILPRFC 

EMCCDCRMFHWRKWCAANFGDICSWGQQAT 

EEMLLPRPFQLCSDNVWGVRKACAECFMAVSC

ATCQEIRRTKLSALFINLISDPSRWVRQAAFQSLG 

PFJSTFANPSSSGQYFKEESKSSEEMSVENNKRTR 

DQEAPEDVQVRPEDTPSDLSVSNSSVILENTMED 

HAAEASGKPLGEISVPLDSSLLCTLSSESHQEAAS 

NENDKKPGNYKSMLRPEVGTTSQDSALLDQELY 

NSFHFWRTPLPEIDLDIELEQNSGGKPSPEGPEEE

SEGPVPSSPNITMATRKELEEMIENLEPHIDDPDV 

KAQVEVLSAALRASSLDAHEETISIEKRSDLQDE 

LDINELPNCKmQEDSVPLISDAVENMDSTLHYIH 

NDSDLSNNSSFSPDEERRTKVQDWPQALLDQY 

LSMTDPSRAQTVDTEIAKHCAYSLPGVALTLGR

QNWHCLRETYETLASDMQWKVRRTLAFSIHELA 

VILGD\QLTAADLVPIFNGFLK*PSMKSRIGVLKH 

LHDFLKLLHIDKRREYLYQLQEFLVTDNSR2WR 

FRAELAEQLILLLELYSPRDVYDYLRPIALNLCAD 

IV v V iv W i i\JL V o fc,M V JsJ^HAA irr lrOVlJLlN 

ELVENFGRCPKWSGRQAFVFVCQTVIEDDCLPM 

DQFAVHLMPHLLTLANDRVPNVRVLLAKTLRQT 

LLEKDYFLASASCHQEAVEQTIMALQMDRDSDV 

KYFASIHPASTKISEDAMSTASSTY 


3892 


A 


158 


2191


VPLPAPSGLSGGGSRGAGCKKAPPGRAPAPGLAP 
LRPSEPTMAVPPGHGPFSGFPGPQEHTQVLPDVR 
LLPRRLPLAFRDATSAPLRiaSVDLIKTYKHINEV 
YYAKKKRRAQQAPPQDSSNKKEKKVLNHGYDD 
DNHDYIVRSGERWLERYEIDSLIGKGSFGQVVKA 
YDHQTQELVAIKIIKNKKAFLNQAQIELRLLELM 
NQHDTEMKYYTVHLKRHFMFRNXHLCLVFELLS
YNLYDLLRNTHFRGVSLNLTRJCLAQQLCTALLF 
LATPELSUHCDLKPENILLCNPBCRSAIKIVDFGSS 
CQLGQRIYQYIQSRPYRSPEVLLGTPYDLAIDMW 
SLGCILVEMHTGEPLFSGSNEVCPQEGVDQMNRI 
VEVLGPPAAMLDQAPKARKYFERLPGGGWTLR 
RTKELRKDYQGPGTRRLQEVLGVQTGGPGGRRA 
GEPGHSPAD\Y\LRFQDLVLRMLEYEPAARISPLG 
ALQHGFFRRTADEATNTGPAGSSASTSPAPLDTC 
PSSSTASSISSSGGSSGSSSDNRTYRYSNRYCGGP 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of

peptide 

sequence 


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










nPPTTriPPMTsJCPO\7T>PQr^DT pp\Tt/ a r:rn\;DUFTLr 
\JXrri 1 LJK^&iVuy olrv£ V rroyrLlvr W /VOkjL/ V rrlis. i ri 

QAPASASSLPGTGAQLPPQPRYLGRPPSPTSPPPP 

FT MTW^T VfinPAnr^PPWPAPAPnHPAAQAT PT 

RMTGGRPPLPPPDDPATLGPHLGLRGVPQSTAAS 
S 


3893 


A 


68 


258 


PEEYYPFSPTLQQLFFFLLDSDMGSRPESMGCRK 
NTVPRPASPTEAGTDPQTFLHTWVSECRD 


3894 


A 


1120 


136 


SLPLAPAPAVAGPVALCPAGLCPAQPGMPAGPA 

AASGSHPEVGSVLQRSSQPHWPNPWPGAGHLPP 

PAGPFPYNPPAGPGAAAGLA*SPPRSSPTPCSVGP 

QSCPANASAPPAQPCLAGAPPAASLPPPGPGSVS 

AAPAPGGPAPAEPPLGVPPVPAWLLPDSPPLPGT 

TTQrfTPPP A A VCT PPA AAA r , P\A7\/PPPT DUXJPDrM "DC 

PSAAAPNPGCAGGIRHFPPGSPEASSPLRPAAAPA 
LLPLPRPPS*P/VPWKPLHSPVAVAGGSFVAGGSV 
LPAPDLDQPRPSGPPAASPTPGPGVAQPPPGSAVL 
PTVP*APPVSGAAPGRKREW 


3895 


A 


2 


1347 


FGAVSYRPGNGSCWVKVTASSDLSDLISCLCPPR 

SLCSSQACVLPVPGPSLLLPQGLHVGCASAGTRW 

PLSCSIDFQRLLAHEEETQKRRAKESGMAFTQLT 

FRDVAIEFSQDEWKCLNSTQRTLYRDVMLENYR 

NLVSLDLSRNCVIKELAPQQEGNP/ARSIPHSDIGT 

T*KT*H*RVLLQGNQEKNTRL*LSVER**KKLQQ

SDYGPKRKSYL*ERPTR*KRYRKQVY*TSA\*LSF

LPHPHELQQFQAEGKIYECNHVEKSVMHGSSVSP 

PQIISSTIKTHVSNKYGTDFICSSLLTQEQKSCIRE 

DLCGKVFSQKSNLARHWRVHTGEKPYKCNECD 
RSFSRNSCLALHRRVHTGEKPYKCYECDKVFSR 

HQVIHSDK 


3896 


A 


202 


498 


MVQSCSAYGGKNRYDKDKPVSFHKFPLTRPSLC 
KEWEAAVRRKNFKPTKYSSICSEHFTPDCFKREC 


3897 


A 


2 


382 


SHGLSRAPHLSAAPAPALASRPCFSSAPCSQGGG 
GGGPATMIHF1LLFSRQGKLRLQK\?VTITLPDKER 
KKITREIVQnLSRGHRTSSFVDWKELKLVYKRYA 


3898 


A 


718 


305 


SEQEPLLGDTPGSREWDILETEEHYKSRWRSIRIL 
YLTMFLSSVGFSWMMSIWPYLQKIDPTADTSFL 
GWVIASYSLGQMVASPIFGLWSNYRPRKEPLIVSI 
LISVAANCLYAYLHIPASHNKYYMLVARGLLGIG 


3899 


A


24


718 


FRGRPGPEREGKGNHSFVEVARV1VVDLHSRLG 
GAMAERKGTAKVDFLKK1EKEIQQKWDTERVFE 
v iN/\oiNi^dxVi i ojvoiv i r v lrr iri ivirNOJSJLnJLAjJti I 
FSLSKCEFAVGYQRLKGKCCLFPFGLHCTGMPIK 
ACADKLKRECELY/GCPPDFPDEEEEEEETSVKTE 
DniKDKAKGKKSKAA/AXAGSSKYQWGIMKSLG 

T ^riFFTVTfFQF AF*H"\\/T TWTTMAI ATOrM VP1WIY1 
X^OXJ CtCf Jl V JVT OXZ/r\IZfTl W LtXJ I JT JN /\ JU/i J v^L/i^ JVIVl Vi KJ 


3900


A 


360 


1 


VPATSSNVSPSSSESSEPDLSSRSSSSDAPSSSPSVP 
SPCSLSLSSPESPLLPTLLSSKSPAGSAGPTCGCPS 
GPGLRATA/PSRLSSSIAAH/SSSAPETSRPAAARE 
RSPPLHDRESHE 


3901


A 


193 


345 


GEWAVPPAPGGQGVSIPHGPEPGQGSGVHIAPRQ 
GEGSDRTEPL1CPKAAP 






WO 01/57190 



PCT/US01/04098 



SEQ ID
NO: 


Method 


Predicted 
beginning 
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence 


Predicted end 
nucleotide 
location 
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


3902 


A 


1188 


1389 


NPAARSAAAREGSPALPPPPVS/SSSGLGLLLPLSP 
PGSRAAMPAT KPRAPHSRVRPPPPPrJPRTJRPP 


3903 


A 


63


396 


>mMRNPHLSSNHYL^ARTETVFARMESVKQRI 

LAPGKEGLKNFAGKSLGQIYRVLEKKQDTGETDE 

LTEDGKPL*VPERKAPLCDCTCFGLPRRYIIAIMS 
GLGFCISFG 


3904 


A 


732


1046 


AMSECPLILYIHKHIDTYSQSYLFNDLFYPVYSGG 
RMVTYEHLREVVFGKSEDEHYPLW*VLFGK*YA 
VAPNALMFIRFM*NCTFVPKLP*VMDLK**LQYK
SR 


3905 


A 


46 


910 


QPPPPPPPPPSPPPPPFPPARALSHLRLHPDACLFPS 
PFPLPCSTMPGMMEKGPELLGKNRSANGSAKSP 
AGGGGSGASSTNGGLHYSEPESGCSSDDEHDVG 
MRVGAEYQARIPEFDPGATKYTDKDNGGMLVW 

9PVT4^TP*n A VI TYPV7 ATA l^TJVXJ/^VTvTvrn^ A T n'Kjn tt 

WHKK^EKSLADLPNFTPrTDEWTVEDKVLFEQ 
AFSFHGKSFHRIQQMLPDKTIASLVKYYYSWKK 
TRSRTSLMDRQARKLANRHNQGDSDDDVEETHP 
MDGNDSDYDPKKEAKKEGMS 


3906 


A 


2 


■j i j 


GHNHPGELGWENPNEWSQEAAISLISEEEDDTSS 
EATSSGKSJDDYGFISAILFLVTGILLVIISYrVPREV 
TVDPNTVAAREMERLEKESARLGAHLDRCVIAG 


3907 


A 


71 


412 


!LlMSNCLQNrl,KITSTRLLCSRLCQQLRSKRKFF 
GTVPISRLHRRWITGIGLVTPLGVGTHLVWDRLI 
GGESGIVSLVGEEYKSIPCSVAAYVPRGSDEGQF 
NEQNFVSKSD 


3908 


A


77 


746 


LGTLLGWRAPLFSRCLAFHSPFELLNTPKLVKTAE 
LPPDRNYVLGAHPHGIMCTGFLCNFSTESNGFSQ

RQSLDFILSQPQLGQAVVTMVGGAHEALYSVPGE 
HCLTLQKRKGFVRLALRHGASLVPVYSFGENDIF
RLKAFATGSWQHWCQLTFKKLMGFSPCIFWGR 
GLFSATSWGLLPFAVPITTV 


3909 


A 


1


793


FRAAGRPAAAMGDIPVVGLSSWKASPGKVTEAV 

KEAJJDAGYRHFDCAYFYHNEREVGAGIRCKIKE 

GAVRREDLLIATKLWCTCHKXSLVETACRXSLK 

AT VT XTVT "HT VT TTTYX rD'K /imTV - D"DT_TD"C W n\AC O t?T cc 
/\-L-isjl-in i L»UL> I Lflxx W r JVLOr JSJrrrirc, W liVLo U 0.D.L0 r 

CLSHPRVQDLPLDESmiVIPSDTDFLDTWEAME . 
DLVITGLVKMGVSNFrmEQLERLLNKPGLRFKP 
LTNQIECHPYLTQKNLISFCQSRDVSVTAYRPLG 
GSCEGVDLEDNPVIKPJAKEHGKSPAQILI 


3910 


A 


202


705 


FFTMHRKKVDNRIRILIENGVAERQRSLFVVVGD 

tssjjsjj v v 11^ ruTjvjJL o isJ\ I V JSAKrb V L WLY KKliL 

GFSSHRKKRMRQLQKKIKNGTLNIKQDDPFELFI 

AATNIRYCYYNETHKILGNTFGMCVLQDFEALTP 

NLLARTVETVEGGGLVVJXLRTMNSLKQLYTVT 

M 


3911 


A 


3 


723 


AGRGARAAGEGGGPFKSRPRPLPSSRSLPAVGGG 

RYGADKMAAGGAVAAAPECRLLPYALHKWSSF 

SSTYLPENILVDKPNDQSSRWSSESNYPPQYLILK 

LERPAIVQMTFGKYEKTHVCNLKKFKVFGGMN 

EENMTELLSSGLKNDYNKETFTLKHKJDE^ 

RITXIVPLLSWGPSFNFSn\rA^ELSGJl)DPDIVQPC 
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SEQID 
NO: 


Method 


Predicted 
beginning 
nucleotide 

location

corresponding
to first amino
acid residue of
peptide
sequence 


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










LNWSKYREQEAIRLCLKHFRQHNYTEAFESLQ 
KKT 


3912 


A 


2 


461 


LFALAHAAFSAAQHRSYMRLTEKEDESLPIDIVL 
QTLLAFAVTCYGIVHIAGEFKDMDATSELKNKTF 
DTVRNHPSFYVFNHRGSEYFSGPSDTANSSNQDA 
LSSNTSLKLRKJLESLRR 


3913 


A 


362 


20


APGRPEAKVPERSRESGSRRVRGPLLQLRPGRTS 
RPASGRGRGGAGGSYGKMRKPDSKIVLLGDMN
VGKTSLLQRYMERRFPDTVSTVGGAFYLKQWRS 
YNISIWDTAGEAGAA 


3914 


A


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHVVDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICIEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

ICEVKSSKEKPEREKTPSEDKLSVKHKYXGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYTIKTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNST^SEKHADHRSTLTKKMmQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTVVPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKKDGIAVDHVVGLNTEKYAETV 

KLKHKRSPGKVKDISIDVEREWENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

IEADEGLIIGTHSRNNPLHVGAEASECTVFAAAEE 

GGAVVTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 
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SEQ ID 
NO: 


Method 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTVVEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDIDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDHTSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLnSTSTTKDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTS1AEECEASVSGWVESE 

NERAGTVMEEKDGSGHSTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SIDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSA VCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE . 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSS VSSIRYLAA VNTGAIKADDMPPVQ . 

GWAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLKMEAYVPS . 

EEEKNGEILAPPESLCGGKPSG1AELQREPLLVNE 

AISGHSXHEADPKEVEEEERHMPKRKRKQHYLSSE 
DEPDDNPDVLDSRJETAQRQCPETEPHATKEENS 
RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 
KPEQNDDDTIKSQE 


3915    A    1    7545


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC , 

KVLVRQNSTPNTQQP AVHPSTPPSRPLPQ AGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQICDPGVEGICHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHVVDENKMESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEEMQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTDEN 

VRKENNKKERRLSAEKTKAEHK^RRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKKDENKSDDKDGKEVDSS 

HEKARGN S SLMEKKLSRRLCENRRG SLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 










MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMHIQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKKLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTVVPLRESYDPDVIPLFDICRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL 

KQTDCATVENGKKDG1AVDHVVGLNTEKYAETV 

KLKHKRSPGKVKDJSIDVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

DEADEGLIIGTrlSRNNPLHVGAEASECTVFAAAEE 

GGAVVIEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIKAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFV1CSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 

MDSTVAKEGT3WPLVAAGPCDDEGI VTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTVVEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLV SGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLnSTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTOVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTDCCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGVVVESE 

NEFLAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SIDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ 

GTVAEHSrT^PAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 


3916    A    2    773


GPFGVLWPSAKPGPVTAVEARPPDASDPEGLRG • 
GSPAPLLAPGPLDPSGRLHPAVSMMSYLKQPPYG 
MNGLGLAGPAMDLLHPSVGYPATPRKQRRERTT 
FTRSQLDVLEALFAKTRYPD1FMREEVALKINLPE 










SRVQVWFKNRRAKCRQQQQSGSGTKSRPAKKK 
SSPVRESSGSESSGQFTPPAVSSSASSSSSASSSSA 
NPAAAAAAGLWAKLPCPLHIFSLCVFIEENRLV 
SGSWARDIRSVEETDKSGYR 


3917    A    2    776


R3SFIPGRRFRPPGLRRLLKGPHMPREPRGYRTRVP 
ALRELVPSSHAGSGASEHCQNNRQGSRQHRASR 
NVQAGGALAPPRHLCGLCSRLHFLKPDLSVRAA 
PSRAGASVMALRKELLKSTWYAFTAT DVFK^frK 
VSKSQLRVLSHNLYTVLHIPrlDPVALEEHFRDDD 
DGPVSSQGYMPYLNKYELDKVEEGAFVKEHFDE . 
LCWTLTAKX>m^DSNGNSMLSNQDAFRLWCL 
FNFLSEDKYPLIMDPDEGEYLLKRYS 


3918    A    10    318


WQDLVCLGGSRAQEQKPLQQLWNAILLVAMLL 
CTGLWQAQRQASRQSQRELGGQVDLFKRRVV 
RRLASLKTRRCRLSRAAOGLPDPGAETCAVCLD 
YFCNKQ 


3919    A    1    204


RVLTAINHTLKENLRKFYKGKKDKPLDLRPKKT 
RAMRRRLNMHEEISfLKTKKQHRKERLYPLRKYA 
AKA 


3920    A    1    654


RCCRSFVAPLQEKVVFGLFFLGAILCLSFSWLFHT 
VYCHSEGVSRLFSKLDYSG1ALLIMGSFVPWLYY 

SFYPTsJPOPPFTVT TVTPVT f*TA ATlV<sn\\/ri\/rF ATPO 
or i wix r\gr un i JUi v iu v i^vji/i-riii v <-> v<j Yv 1J ivlt J\ i x\£ 

YRGVRAGVFLGLGLSGIPTLHYVISEGFLKAATI 

GQIGWLMLMASLYITGAALYAARJPERFFPGKCD 

IWFHSHQLFmFVVAGAFVHFHGVSNLQEFRFMI 

GGGCSEEDAL 


3921    A    1587    452


LERDGCGGEEGGSVRSGAGPDSDPRGASSPPAG 

HRGTAASPRPVAAPSRTPAPPHTRARASPGLPSG 

PA\\01RVQWFSRVSGQVSTLMKATVLMRQPGRV 

QEIVGALRKGGGDRLQVISDFDMTLSRFAYNGK 

RCP S S YNILDNSKIISEECRKELTALLHH YYPIEID 

PmWKEKLPHMVEWWTKAHNLLCQQ^ 

AQVVraSNAMLREGYKTFFOTLYHNNIPLFIFSA 

GIGDTLFFTTR 0"N/n<r VFlTPlsHl-JTV^'N VMnFNFnfiFT 

VJAVJi^lJjliJ^iJLJ\.V^IVXJS. V 17 xlx IN JLCTi V OiN I lvlJL/r IN CtUKJr X-i 

QGFKGQLIHTYNKNSSACENCGYFQQLEGKTNV 
ILLGDSTGDT TMADGVPGVONTT TH^FT MnTiTVPF 

Xl~il-i\-*X*S *JX\JX~rX*i X iYJJTJU'VJ V X \J V V^IN JJ^ISJLVJF LjIN J-Tv V JDJCi 

RRERYMDSYDrVLEKDETLDVVNGLLQHILCQG 
YQLEMQGP 


3922    A    2    164


GK3YQRAFGGHSLKFGKGVQAHGCCCVADRTG 
HSILHTSYGRERPAPVHLRQDT 


3923    A    2    3258


EHATHAYA1>XGTRRRHREVTVFVPTWQLKKNR 

RVI^SHFLTKLHSLKIVILSITPSQLENGKKITTYD 

YRFMVI<XAEETDGnVTNEQmiLMNSSI<^MVK 

DRLLPFTFAGNLFMVPDDPLGRDGPTLDEFLKKP 

NRLDTDIGNFLKVWKTLPPSSASVTELSDDADSG 

PLESLPNMEEVREEKEERQDEEQRQGQGTQKAA 

EEDDLDSSLASVFRVECPSLSEEILRCLSLHDPPD 

GALDIDLLPGAASPYLGIPWDGKAPCQQVLAHL 

AQLTlPSNFTALSrT^GFMDSHRDAIPDYEALVG 

PLHSLLKQKPDWQWDQEHEEAFLALKRALVSAL 

CLMAPNSQLPFRLEVTVSHVALTAILHQEHSGRK 

HPIAYTSKPLLPDEESQGPQSGGDSPYAVAWALK 

HFSRCIGDTPV VLDLS YA SRTTADPEVREGRRVS 

KAWLIRWSLLVQDKGKRALELALLQGLLGENRL 

LTPAASMPRFFQVLPPFSDLSTFVCIHMSGYCFYR 










EDEWCAGFGLYVLSPTSPPVSLSFSCSPYTPTYA 

HLAAVACGLERFGQSPLPWFLTHCNWSLLWE 

LLPLWRARGFLSSDGAPLPHPSLLSYIISLTSGLSS 

LPFIYRTSYRGSLFAVTVDTLAKQGAQGGGQWW 

SLPKDVPAPTVSPHAMGKRPNLLALQLSDSTLAD 

IIARLQAGQKLSGSSPFSSAFNSLSLDKESGLLMF 

KGDKKPRVWVVPTQLRRDLIFSVHDEPLGAHQR 

PEETYKKLRLLGWWPGMQEHVKDYCRSCLFCIP 

RNLIGSELKVIESPWPLRSTAPWSNLQIEWGPVT 

ISEEGHKHVLIVADPNTRWVEAFPLKPYTHTAVA 

QVLLQHVFARWGVPVRLEAAQGPQFARHVLVS 

CGLALGAQVASLSRDLQFPCLTSSGAYWEFKRA 

LKEFIFLHGKKWAASLPLLHLAFRASSTDATPFK 

VT 7WSPCPT TT7DT YlfWFEAAQQ A "KTVCnX VKATWrVJ Y r\ 

V L, I Urvjlio JvJL 1 JlrL W W JcJVLo b AIN J_bU JU JSJVliJ V r LLij 

LVGELLELHWRVADKASEKAENRRFKRESQEKE 
WNVGDQVLLLSLPRNGSSAKWVGPFYIGDRLSL 
SLYRIWGFPTPEKLGCIYPSSLMKAFAKSGTPLSF 
KVLEQ 


3924    A    1    1826


MGSVTVRYFCYGCLFTSATWTVLLFVYFNFSEV 

TQPLICNVPVKGSGPHGPSPKKFYPRFTRGPSRVL 

EPQFBCANKIDDV1DSRVEDPEEGHLKFSSELGMIF 

NERDQELRDLGYQKHAFNMLISDRLGYHRDVPD 

TRNAACKEKFYPPDLPAASWICFYNEAFSALLR 

TVHSVIDRTPAHLLHEULVDDDSDFDDLKGELDE 

YVQKYLPGKIKVIRNTKREGLIRGRMIGAAHATG 

EYLVFLDSHCEVNVMWLQPLLAAIREDRHTVGC 

PVIDIISADTLAYSSSPVVRGGFNWGLHFKWDLV * 

PLSELGRAEGATAPDCSPTMAGGLFAMNRQYFH 

ELGQYDSGMDIWGGENLEISFRIWMCGGKLFIIP 

CSRVGHIFRKRRPYGSPEGQDTMTHNSLRLAHV 

WLDEYKEQYFSLRPDLKTKSYGNISERVELRKKL 

GCKSFKWYLDNVYPEMQISGSHAKPQQPEFVNR 

UrisJvr Js. V LyKuKL i rxLKl *■ V A^LrKrbyKO 

GLWLKACDYSDPNQIWIYNEEHELVLNSLLCLD 

MSETR5SDPPRLMKCHGSGGSQQWTFGKNNRLY 

QVSVGQCLRAVDPLGQKGSVAMAICDGSSSQQ 

WHLEG 


3925    A    5386    2897


VRWNSKTEC YLSIQTQENFP ANLhFEL VNCI VI SSL 

VTTQRKLKAMSLLGSRNQLARAVLNPNPMDFCT 

KDLLTTTSERIIAYLRDFNEDQKKAIETAYAMVK 

HSPSVAKICLErlGPPGTGKSKTTVGLLYRLLTENQ 

RKGHSDENSNAKIKQNRVLVCAPSNAAVDELM 

KKIILEFKEKCKDKKNPLGNCGDINLVRLGPEKSI 

NSEVLKFSLDSQVNHRMKKELPSHVQAMHKRK 

EFLDYQLDELSRQRALCRGGREIQRQELDENISK 

VSKERQELASKIKEVQGRPQKTQSIIILESHIiCCT 

LSTSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEI 

ETLTPLHRCNKLILVGDPKQLPPTVISMKAQEYG 

YDQSMMARFCRLLEENVEHNMISRLP1LQLTVQ 

YRMHPDICLFPSNYVYNRl^KTNRQTEAIRCSSD 

WPFOPYLWDVGDGSERRDNDSYINVOEIKLVM 

EIIKLIKDKJUCDVSFRNIGnTHYKAQKTMIQKDL 

DKEFDRKGPAEVDTVDAFQGRQKDCVIVTCVRA 

NSIQGSIGFLASLQRLNVTITRAKYSLFILGHLRTL 

MENQHWNQLIQDAQKRGAIIKTCDKNYRHDAV 










KJDLKLKPVLQRSLTHPPTIAPEGSRPQGGLPSSKL 

DSGFAKTSVAASLYHTPSDSKEITLTVTSKDPERP 

PVHDOLODPRLLKRMGIEVKGGIFLWDPOPSSPO 

HPGATPPTGEPGFPVVHQDLSHVQQPAAVVAAL 

SSHKPPVRGEPPAASPEASTCQSKCDDPEEELCH 

RREARAFSEGEQEKCGSETrfflTRRNSRWDKRTL 

EQEDSSSKKRKLL 


3926    A    99    284


MPREDRATWKSNYFLIOIQLLDDYPKRFIVGANN 
VGSKQMQQIRMSLRGKAWLMGKNTMMR 


3927    A    542    2


AHLLMLNLAL\TDLL\YLTSLPFLIHYYASGENWI 
FGDFMCKFIRFSFHFNLYSSILFLTCFSIFRYCVIIH 
PMSCFSIHKmCAWACAVVWIISLVA\OPMTFLI 
TSTNRTNRSACLDLTSSDELNTIKWYNLILTA\LL 
CLPLVIVTLCYTTIIHTLTHGHANXDSCLKQKARR 
LTILLL 


3928    A    1    1516


GEEAVGGGAEGGGFGVGAQGRAGGRGVEAGR 

MRLSKTLVDMDMADYSAALDPAYTTLEFENVQ 

VLTMGNDTSPSEGTNLNAPNSLGVSALCAICGDR 

ATGKHYGASSCDGCKGFFRRSVRKNHMYSCRFS 

RQCVWKDKRNQCRYCRLKKCFRAGMKKEAV 

QNERDRISTRRSSYEDSSLPSINALLQAEVLSRQIT 

SPVSGINGDIRAKKIASIADVCESMKEQLLVLVE 

WAKYIPGFCELPLDDQGALLRAHAGEHLLLGAT 

KRSMVFKDVLLLGNDYIVPRHCPELAEMSRVSIR 

ILDELVLPFQELQIDDNEYAYLKAIEFFDPDAKGL 

SDPGKIKRLRSQVQVSLEDYINDRQYDSRGRFGE 

LLLLLPTLQSITWQMIEQIQFIKLFGMAKIDNLLQ 

EMLLGGSPSDAPHAHHPLHPHLMQEHMGTNVIV 

ANTMPTHLSNGQMCEWPRPRGQAATPETPQPSP 

PGASGSEPYKLLPGAVA-nVKPLSAlPQPTlTKQE 

VI 


3929    A    1    2782


RVLSLESPLEKDPRVLGAQS VPRGRALKGLSPLG 

LDSAFRLFPDPRAGPWNTAVLSSGMEPETALWG 

PDLQGPEQSPNDAHRGAESENEEESPRQESSGEEI 

IMGDPAQSPESKDSTEMSLERSSQDPSVPQNPPTP 

LGHSNPLDHQPLDPPAPEVVPTPSDWTKACEAS 

WQWGALTTWNSPPVVPANEPSLRELVQGRPAG 

AEKPYICNECGKSFSQWSKLLRHQRIHTGERPNT 

CSECGKSFTQSSHLVQHQRTHTGEKPYKCPDCG 

KCFSWSSNLVQHQRTHTGEKPYKCTECEKAFTQ 

STNLIKHQRSHTGEKPYKCGECRRAFYRSSDLIQ 

HQATHTGEKPYKCPECGKRFGQNHNLLKHQKIH . 

AGEKPYRCTECGKSF1QSSELTQHQRTHTGEKPY 

ECLECGKSFGHSSTLIKHQRTEbLREDPFKCPVCG 

KTFTLSATLLRHQRTHTGERPYKCPECGKSFSVS 

SNLINHQRJHRGERPYICADCGKSFIMSSTLIRHQ 

RIHTGEKPYKCSDCGKSFIRSSHLIQHRRTHTGEK 

PYKCPECGKSFSQSSmTHVRraMDENLFVCSD 

CGKAFLEAHELEQHRVIHERGKTPARRAQGDSL 

LGLGDPSLLTPPPGAKPHKCLVCGKGFNDEGIFM 

QHQRJHIGENPYKNADGLIAHAAPKPPQLRSPRL 

PFRGNSYPGAAEGRAEAPGQPLKPPEGQEGFSQR 

RGLLSSKTYICSHCGESFLDRSVLLQHQLTHGNE 

KPFLFPDYRIGLGEGAGPSPFLSGKPFKCPECKQS 

FGLSSELLLHQKVHAGGKSSHKSPELGKSSSVLL 










EHLRSPLGARPYRCSDCRASFLDRVALTRHQETH 
TQEKPPNPEDPPPEAVTLSTDQEGEGETPTPTESS 
SHGEGQNPKTLVEEKPYLCPECGAGFTEVAALLL 
HRSCHPGVSL 


3930    A    513    273


KTQETOIYISEffiFFPFLQGFGNLPICMAKTDLSLS 
HQPDKKGVPSDFILPISDVRASIGAGFIYPLVGTG 
SRESPLWL 


3931    A    16    305


KRRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 
GL VKM SRKPRA S SPFSNNHP STPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3932    A    16    305


KRRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 
GLVKMSRKPRASSPFSNNHPSTPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3933    A    1    1546


STHASEHWDSALQLAKHLAPDQIPFISKEYAIQLE 

FAGDYVNALAHYEKGITGDNKEHDEACLAGVA 

QMSIRMGDIRRGVNQALKHPSRVLKRDCGAILE 

NMICQFSEAAQLYEKGLYYDKAASVY1RSKNWA 

KVGDLLPHVSSPKMLQYAKAKEADGRYKEAVV 

AYENAKQWQSVIRIYLDHLN>3PEKAVNIVRETQ 

SLDGAKMVARFFLQLGDYGSAIQFLVMSKCNNE . 

AFTLAQQHNKMEIYADHG SEDTFNEDYQSIALY 

FEGEKRYLQAGKFFLLCGQYSRALKHFLKCPSSE 

.DNVAIEMAJEWGQAXDELLTNQLIDHLLGEND 

GMPI<X)AKYLFRLYMALKQYREAAQTAII1AREE 

QSAGNYRNAHDVLFSMYAELKSQKIKIPSEMAT 

NLMILHSYILVKJHVKNGDHMKGARMLIRVANN 

[SKFPSHIVPILTSTVTECHRAGLKNSAFSFAAML 

MRPEYRSKIDAKYKKKIEGMVRRPDISE[EEATTP 

CPFCKFLLPESELL 


3934    A    334    1268


PTRRPILPLTSPKAISVPSPLQGKQHTLVKSCLSVS 

GIGGFLVSLSSRMKLQTLAVSVTALKFWSAYVP 

CQTQDRDALRLTLEQIDLIRRMCASYSELELVTS 

AKALNDTQKLACLIGVEGGHSLDNSLSILRtFYM 

LGVRYLTLTHTCNTPWAESSAKGVHSFYNNISGL 

TDFGEKWAEMNRLGMMVDLSHVSDAVARRAL 

EVSQAPV1FSHSAARGVCNSARNVPDDILQLLEE 

ERWAFVMVSLFHGELIQWQPIRPMCSTVADHFD 

HIKAVMGSKFIGIGGDYDGAGKYRKKTTCKAPW 

RTSSRMSS 


3935    A    1    883


HETTPAVVQSVLLERGWNKFDKQEQNAEDWNL 

YWRTSSFRMTEHNSVKPWQQLNHHPGTTKLTR 

KDCLAKHLK1IMRRMYGTSLYQFIPLTFVMPNDY 

TKFV AEYFQERQMLGTKHS YWICKPAELSRGRG 

ILIFSDFKDFIFDDMYIVQKYISNPLLIGRYKCDLR 

IYVCVTGFKPLTIYVYQEGLVRFATEKFDLSNLQ 

NNYAHLTOSSINKSGASYEKIKEVIGHGCKWTLS 

RFFSYLRSWDVDDLLLWKKIHRMVILTILAIAPS 

VPFAANCFELFGFDILIDDNEFHRTG 


3936    A    203    441


HLAHSLGPLPKHYQYCVRYLYYQVTKDVIKEFA 
DDGVKYLELRSTPRRENATGMTKKTYVESILEGI 
KQSKQENLDIDV 






WO 01/57190 



TABLE 7 



PCT/US01/04098 



SEQ ID NO:    Position of end of Signal in Amino Acid Sequence    MaxS (MAXIMUM SCORE)    MeanS (Mean Score)

1       19      0.930    0.680
2       24      0.964    0.863
3       21      0.990    0.901
4       19      0.981    0.942
5       22      0.991    0.928
6       21      0.956    0.843
8       22      0.913    0.718
9       17      0.997    0.969
11      19      0.930    0.680
13      36      0.983    0.863
14      28      0.935    0.839
15      21      0.997    0.955
16      16      0.983    0.944
17      18      0.989    0.884
19      49      0.996    0.719
20      28      0.972    0.920
21      23      0.954    0.905
22      46      0.955    0.568
23      26      0.942    0.654
24      19      0.979    0.941
25      34      0.884    0.565
26      33      0.934    0.584
27      17      0.975    0.914
28      18      0.980    0.934
29      23      0.928    0.718
30      26      0.978    0.885
32      20      0.946    0.719
33      29      0.933    0.671
35      25      0.996    0.920
36      26      0.903    0.579
40      19      0.981    0.942
47      25      0.971    0.909
53      22      0.991    0.928
55      24      0.960    0.808
60      19      0.986    0.967
78      22      0.913    0.718
86      20      0.883    0.555
87      24      0.982    0.889
88      17      0.997    0.969
115     19      0.930    0.680
134     36      0.983    0.863
136     17      0.913    0.696
137     19      0.958    0.905
140     28      0.935    0.839
143     32      0.914    0.740
153     21      0.997    0.955
154     25      0.913    0.583
155     29      0.972    0.857
169     30      0.977    0.817
170     30      0.977    0.819
171     30      0.977    0.819
175     47      0.926    0.606
176     30      0.968    0.872
177     22      0.957    0.791
192     43      0.930    0.678


195     19      0.956    0.860
202     21      0.982    0.871
203     24      0.957    0.870
207     23      0.954    0.905
224     46      0.955    0.568
225     26      0.942    0.654
228     45      0.961    0.839
231     28      0.994    0.937
232     28      0.993    0.896
234     19      0.979    0.942
235     19      0.979    0.941
238     20      0.987    0.943
244     23      0.929    0.683
250     34      0.884    0.565
256     33      0.934    0.584
258     25      0.934    0.729
259     22      0.969    0.871
264     19      0.952    0.753
265     17      0.975    0.914
266     17      0.975    0.914
271     23      0.974    0.884
274     13      0.971    0.834
275     18      0.980    0.934
278     32      0.958    0.668
280     24      0.966    0.881
281     24      0.966    0.881
286     23      0.928    0.718
291     35      0.991    0.824
293     27      0.956    0.806
294     23      0.952    0.827
301     26      0.978    0.885
316     20      0.946    0.719
320     28      0.978    0.726
327     29      0.933    0.671
331     48      0.903    0.571
345     25      0.996    0.920
349     26      0.903    0.579
351     24      0.951    0.876
352     18      0.944    0.716
353     32      0.992    0.854
354     27      0.945    0.817
355     16      0.922    0.716
356     13      0.959    0.818
357     23      0.986    0.878
358     19      0.904    0.671
359     16      0.988    0.951
360     15      0.981    0.938
361     18      0.944    0.716
362     21      0.984    0.869
363     40      0.979    0.813
364     18      0.883    0.693
365     22      0.962    0.908
366     22      0.961    0.827
367     44      0.941    0.624
368     20      0.952    0.791
369     22      0.949    0.840
370     28      0.957    0.682


372     28      0.974    0.894
373     19      0.972    0.947
374     29      0.968    0.785
375     19      0.949    0.897
377     23      0.962    0.910
378     31      0.974    0.895
379     26      0.969    0.939
380     27      0.945    0.817
383     27      0.945    0.817
384     25      0.992    0.877
385     32      0.983    0.825
386     44      0.924    0.564
387     26      0.971    0.894
388     19      0.989    0.862
389     24      0.990    0.947
390     34      0.942    0.635
391     16      0.922    0.716
394     19      0.987    0.970
398     36      0.992    0.866
404     13      0.959    0.818
417     23      0.986    0.878
421     19      0.904    0.671
425     28      0.971    0.717
431     16      0.988    0.951
452     18      0.944    0.716
459     21      0.991    0.902
468     21      0.984    0.869
478     40      0.979    0.813
486     18      0.883    0.693
499     22      0.962    0.908
501     19      0.962    0.877
514     44      0.941    0.624
529     20      0.952    0.791
533     39      0.914    0.719
548     28      0.957    0.682
561     28      0.974    0.894
562     28      0.974    0.893
564     18      0.949    0.806
576     19      0.972    0.947
584     29      0.968    0.785
585     28      0.973    0.810
591     19      0.949    0.897
592     24      0.991    0.954
594     20      0.985    0.959
595     20      0.985    0.959
612     23      0.962    0.910
619     31      0.974    0.895
621     15      0.959    0.795
633     26      0.969    0.939
640     20      0.949    0.842
645     25      0.911    0.759
684     25      0.992    0.877
691     32      0.983    0.825
698     44      0.924    0.564
700     19      0.982    0.941
710     26      0.971    0.894
714     23      0.965    0.907


718     19      0.989    0.862
725     21      0.976    0.851
728     33      0.961    0.895
734     25      0.963    0.660
741     34      0.942    0.635
744     19      0.959    0.924
747     16      0.922    0.716
756     26      0.973    0.864
767     22      0.986    0.943
768     27      0.916    0.758
769     19      0.987    0.970
770     22      0.981    0.933
771     34      0.993    0.893
773     20      0.968    0.939
774     21      0.971    0.945
778     22      0.986    0.943
779     32      0.973    0.846
781     23      0.950    0.857
785     27      0.916    0.758
786     27      0.916    0.758
788     22      0.981    0.933
793     22      0.986    0.803
794     39      0.892    0.654
797     27      0.965    0.847
810     22      0.981    0.933
823     34      0.993    0.893
825     17      0.962    0.778
837     20      0.968    0.939
844     25      0.984    0.951
845     17      0.919    0.706
846     21      0.971    0.945
847     21      0.971    0.945
890     22      0.986    0.943
893     24      0.971    0.865
894     24      0.971    0.865
896     32      0.973    0.846
899     31      0.982    0.817
922     15      0.882    0.706
924     21      0.975    0.948
925     21      0.927    0.661
933     20      0.967    0.906
960     20      0.967    0.906
967     38      0.970    0.784
968     47      0.970    0.557
972     36      0.945    0.775



TABLE 8 



SEQ
ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding to
first amino acid
residue of
peptide sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic
Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible nucleotide
insertion)


3955    A    235    1272


GPREVLAASSLADGSEEQVMAVALVRERDLSFPG 
VGDAVVNPTRWHLPAQPEMLYEGGEGRMETLK 










DKTLQELEELQKDSEAIDQLALESPEVQDLQLERE 

MALATNRSLAERKLEFQGPLEISRSNLSDRYQELR 

KLVERCQEQKAKLEKFSSALQPGTLLDLLQVEGM 

KIEEESEAMAEKFLEGEVPLETFLENFSSMRMLSH 

LRRVRVEKLOEV VRKPRASOEL AGD APPPR SPPP 

V/PPSPPGNTPCG*RAAAAT1SHASLPFALQPIPQPA 

CGPHCPV^SPATGPFPSSVPALLLQRASGPHLPGSP 

AWTQGCCGLLLVPTEEHAAPPYGFPPPPGPAWPG 

Y 


3956    A    821    385


SICADRTERVGTFFYIP AGTTDF A TWTRP*FfiTI<i V7 
SNHAGIQRSSRP/SHYQGE/WHDNCFTADELQLLT 
YQLCHTYVRCmSVSIPAPAYYAHLVAFRARYHL 
VDKEHDSAEGSHVSGQSNGRDPQALAKAVQIHQ 
DTLRTMYFA 


3957    A    4621    240


ELISTFKLLLEKKRSEVMKMKKRYEVGLEKLDSA 

SSQVATMQMELEALHPQLKVASKEXODEMMIMIE 

KESVEVAKTEKJVKADETIANEQAMASKAIKDEC 

DADLAGALPILESALAALDTLTAQDITWKSMKSP 

PAGVKLVMEAICILKGIKADKIPDPTGSGKKIEDF 

WGPAIOnJLLGDMRFLQSLHEYDKDNPPAYMNIIR 

knyipnpdf\tpekirn asta aeglckwviamd sy 

, dkvakiv apkk1klaaaegelkiamdglrkkqa 

alkevqdklarlqdtlelnkqkkadlenqvdlc 

skkleraeqligglggektrwshtalelgqlyin 

ltgdilissgvvaylgaftstyrqnqtkjewttlck 

grd1pcsddcslmgtlgeavtirtwniaglpsdsf 

s]i)nginnwarrwplmidpqsqankwiknmeka 

nslyviklsepdyvrtlenciqfgtpvllenvgee 

ldpileplllkqtfkqggstcirlgdsiteyapdfr 

fy1ttklrnphylpetsvkvtllnfmitpegmqdq 

llgivvaqerpdleeekqalilqgaenkrqlkeie 

dkjlevlsssegniledetaikilssskalaneisqk 

qevaeetekkidttrmgyrpiaihssilffsladla 

niepmyqysltwfinlfilsiensekseilakrlqil 

kdhfiyslyvnvcrslfekidl^lfsfclitnlllh 

erainkaewrflltggigldnpyanpctwlpqks 

wdeicrlddlpafktirrefmrlkdgwkkvydsl 

ephhevfpeewedkanefqrmliirclrpdkvipm 

lqefiinrlgrafmpppfdlakafgdsnccaplifv 

lspgadpmaallkfaddqgyggsklsslslgqgq 

gpiamkmlekavkegtwvvlqnchlatswmpt 

lekvceelspesthpdfrmwltsypspnfpvsvlq 

ngvkmtneapkglraniirsylmdpisdpeffgsc 

kkpeefkkllyglcffhalvqerrkfgplwwnip 

yefnetdlrisvqqlhmflnqyeelpyealrymt 

gecnyggrvtdd wdrrtlrsilnkffnpel vens 

dykfdssgiyfvppsgdhksyieytktlpltpapei 

fgmnanaditkdqsetqllfdnilltqsrsagag 

akssdevvnevasdilgklpnnfdieaamrrypt 

TYTQSMNTVLVQEMGIUl^LKTniDSCVNIQKA 

IKGLAVMSTDLEEWSSILNViaPEMWMGKSYPS 

LKPLGSYVNDFLARLKFLQQWYEVGPPPVFWLSG 

FFFTQAFLTGAQQNYARKYTIPIDLLGFDYEVMED 

KEYKHPPEDGVFIHGLFLDGASWNRKIKICLAESH 

PKILYDTVPVMWLKPCKRADIPICRPSYVAPLYKT 










SERRGVLSTTGHSTNFVIA\MTLPSDQPKEHWIGR 
GVALLCQLNS 


3958    A    35    529


GADMAXSKNHTTHNOSRKWHRNVIKKPLSORYK 

SLKGVDPKFLGNMCFTKJKHKKKGLKKMQADSA 

KAVSTCAKAIEALVKPKEVKPKIPKGVSCELN*LA 

YIAYPKFWTCACACIAKGLRLCQPKAKAQDQTK 

AQVQIKAQAAAPASVPTQAPKGAQAPTKASG 


3959    A    1883    763


LLVLLLRTNLLIASSTRISRATLTCSPPGIPVDPRVR 

PRVRSHLVMYLGITTGSLHKAVVSGDSSAHLVEEI 

QLFPDPEPVRNLQLAPTQGAVFVGFSGGVWRVPR 

ANCSV YESCVDC VLARDPHC A WDPESRTCCLLS A . 

PNLNSWKQDMERGNPEWACASGPMSRSLRPQSR 

PQIIKEVLAVPNSILELPCPHLSALASYYWSHGPAA 

VPEASSTVYNGSLLLIVODGVGGLYOCWATENGF 

SYPVISYWVDSQDQTLALDPELAGIPREHVKVPLT 

RVSGGAALAAQQSYWPOTVTVTVLFALVLSGALI 

1LVASPLRALRARGKVQGCETLRPGEKAPLSREQH 

LQSPKECRTSASDVDADNNCLGTEVA 


3960    A    1


SYAAPSLFVKSLYWALAFMAVLLAVSGVVIWLA
SRAGARCQQCPPGWVLSEEHCYYFSAEAQAWEA 
SQAFCSAYHATLPLLSHTQDFLGRYPVSRHSWVG 
AWRGPQGWHWIDEAPLPPQLLPEDGEDNLDINCG 
ALEEGTLVAANCSTPRPWVCAKGTQ 
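
The one-letter legend in the Table 8 header lends itself to a mechanical check of the sequence column. The following is a minimal Python sketch and not part of the original disclosure: the names LEGEND_SYMBOLS and unexpected_symbols are hypothetical, and the sketch assumes that any character outside the legend (a digit, a lowercase letter or a space, for example) points to a transcription problem rather than to a real residue.

```python
from typing import List, Tuple

# Symbols permitted by the table legend: the twenty standard residues,
# X (unknown), * (stop codon), / (possible nucleotide deletion) and
# \ (possible nucleotide insertion).
LEGEND_SYMBOLS = frozenset("ACDEFGHIKLMNPQRSTVWYX*/\\")


def unexpected_symbols(sequence: str) -> List[Tuple[int, str]]:
    """Return (zero-based index, character) pairs that are not in the legend."""
    return [(i, ch) for i, ch in enumerate(sequence) if ch not in LEGEND_SYMBOLS]


if __name__ == "__main__":
    # A fragment that uses only legend symbols, including '/'.
    print(unexpected_symbols("SRP/SHYQGE/WHD"))
    # A fragment containing a digit, which the legend does not define.
    print(unexpected_symbols("FTVLG1A"))
```

A check of this kind only flags positions for manual review; it cannot by itself say which residue was intended.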



TABLE 9 



SEQ ID NO:  Accession Number  Species                   Description                                                Smith-Waterman Score  % Identity

3937        Y27700            Homo sapiens              Human secreted protein encoded by gene No. 12.             193                   25
3938        AF093097          Homo sapiens              putative RNA-binding protein Q99                           3881                  84
3939        AB012308          Anthocidaris crassispina  B2HC                                                       4169                  74
3940        U10248            Homo sapiens              ribosomal protein L29                                      787                   95
3941        Y99418            Homo sapiens              Human PRO1317 (UNQ783) amino acid sequence SEQ ID NO:277.  4031                  100
3942        AL023516          Gallus gallus             B locus C type Lectin                                      198                   35



TABLE 10 



SEQ ID NO:  Accession No.  Description                      Results*

3937        PR00049        WILM'S TUMOUR PROTEIN SIGNATURE  PR00049D 0.00 9.168e-11 209-224
3942        BL00615        C-type lectin domain proteins.   BL00615A 16.68 6.400e-11 37-55


* Results include in order: accession number subtype; raw score; p-value; position of signature in amino acid
sequence
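
The field order given in the footnote to Table 10 can be illustrated with a short parsing sketch. This is a hedged example only: the SignatureHit type and parse_result function are hypothetical names, the fields are assumed to be separated by whitespace, and the exponent of the p-value is assumed to read "11".

```python
from typing import NamedTuple


class SignatureHit(NamedTuple):
    """One Results entry, in the order stated in the table footnote."""
    subtype: str      # accession number subtype, e.g. "BL00615A"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first position of the signature in the amino acid sequence
    end: int          # last position of the signature in the amino acid sequence


def parse_result(field: str) -> SignatureHit:
    """Split a whitespace-separated Results entry into typed fields."""
    subtype, score, p_value, span = field.split()
    start, end = span.split("-")
    return SignatureHit(subtype, float(score), float(p_value), int(start), int(end))


if __name__ == "__main__":
    print(parse_result("BL00615A 16.68 6.400e-11 37-55"))
```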
TABLE 11



SEQ ID NO:  PFAM Name       Description                    P-Value   PFAM Score

3938        Piwi            Piwi domain                    2.6e-150  512.7
3940        Ribosomal L29e  Ribosomal L29e protein family  2.3e-19   77.8
3941        Sema            Sema domain                    4e-181    615.1
3942        lectin c        Lectin C-type domain           0.086     -7.1



TABLE 12 





SEQ ID NO:    Position of end of Signal in Amino Acid Sequence    MaxS (Maximum Score)    MeanS (Mean Score)

3941    31      0.985    0.926
3942    21      0.974    0.894




TABLE 13 



Columns, in order: SEQ ID NO: of full length nucleotide sequence; SEQ ID NO: of full length peptide sequence; SEQ ID NO: of contig nucleotide sequence; SEQ ID NO: of contig peptide sequence; Priority Docket number corresponding SEQ ID NO: in priority application; SEQ ID NO: in USSN 09/496,914

3937    3943    3949    3955    787CIP2G 1    787_3587
3938    3944    3950    3956    787CIP2G 2    787_3813
3939    3945    3951    3957    787CIP2G 3    787_4462
3940    3946    3952    3958    787CIP2G 4    787_4887
3941    3947    3953    3959    787CIP2G 5    787_5794
3942    3948    3954    3960    787CIP2G 6    787_8743



TABLE 14 



TISSUE ORIGIN                          LIBRARY/RNA SOURCE   HYSEQ LIBRARY NAME  SEQ ID NOS:

adult brain                            GIBCO                ABD003              3940
adult brain                            Clontech             ABR006              3940
adult brain                            Invitrogen           ABR014              3940
cultured preadipocytes                 Stratagene           ADP001              3937
adult heart                            GIBCO                AHR001              3940
adult kidney                           GIBCO                AKD001              3940
adult lung                             GIBCO                ALG001              3940
young liver                            GIBCO                ALV001              3940
adult ovary                            Invitrogen           AOV001              3938, 3940-3941
adult spleen                           GIBCO                ASP001              3940-3941
testis                                 GIBCO                ATS001              3940
bone marrow                            Clontech             BMD001              3938, 3940
bone marrow                            Clontech             BMD004              3940
adult cervix                           BioChain             CVX001              3940
endothelial cells                      Stratagene           EDT001              3940
fetal brain                            Clontech             FBR006              3940
fetal brain                            Invitrogen           FBT002              3940-3941
fetal heart                            Invitrogen           FHR001              3940
fetal kidney                           Clontech             FKD001              3940
fetal kidney                           Clontech             FKD002              3940
fetal liver-spleen                     Columbia University  FLS001              3937, 3940
fetal liver-spleen                     Columbia University  FLS002              3938 3941
fetal liver-spleen                     Columbia University  FLS003              3940
fetal liver                            Clontech             FLV004              3940
fetal skin                             Invitrogen           FSK001              3940-3942
fetal spleen                           BioChain             FSP001              3940
fetal brain                            GIBCO                HFB001              3937, 3940-3941
infant brain                           Columbia University  IB2002              3937, 3939, 3941
leukocyte                              GIBCO                LUC001              3940-3941
leukocyte                              Clontech             LUC003              3940-3941
melanoma from cell line ATCC #CRL1424  Clontech             MEL004              3940
mammary gland                          Invitrogen           MMG001              3937, 3940-3941
neuronal cells                         Stratagene           NTU001              3937 3942
prostate                               Clontech             PRT001              3938
rectum                                 Invitrogen           REC001              3940
salivary gland                         Clontech             SALs03              3941
small intestine                        Clontech             SIN001              3940
skeletal muscle                        Clontech             SKM001              3940
spinal cord                            Clontech             SPC001              3940
thymus                                 Clontech             THMc02              3938
thyroid gland                          Clontech             THR001              3942
uterus                                 Clontech             UTR001              3940

WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a full length protein
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature protein
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active domain 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, and complementary 
sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA.

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the
complementary sequences.

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively
associated with a regulatory sequence that modulates expression of the polynucleotide in the host
cell.

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions
with any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954.




11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the polynucleotide
of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the polynucleotide of
claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in the
sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method 
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound complex is 
detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, under
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a reporter gene sequence in the cell; and

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature
protein coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active 
domain coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, 
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent 
conditions to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, under conditions 
sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of any one of the polypeptides SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960, the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array.

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable
format. 

27. A method of treatment comprising administering to a mammalian subject in need thereof
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a 
pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.
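
By way of illustration only, the complementary sequences referred to in claim 5 and the percent sequence identity referred to in claim 3 can be sketched in Python. The function names reverse_complement and percent_identity are hypothetical and do not appear in the specification. The identity calculation assumes the two sequences have already been aligned to equal length, with gap positions counted as mismatches; in practice an alignment program would normally be run first, and the specification's own definition of identity governs.

```python
# Illustrative sketch only; names and conventions here are assumptions,
# not the method defined in the specification.

# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.translate(COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: inputs are already aligned; a gap character ('-')
    at a position is counted as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(a)
```

For example, percent_identity("ACGTACGTAC", "ACGTACGTAA") compares ten aligned positions, nine of which match, so a caller could test the result against a 90% threshold.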